curl-users
Re: curl/python -- stderr
Date: Fri, 31 Mar 2017 09:50:06 -0500
On 3/31/17 9:33 AM, bruce wrote:
> On Thu, Mar 30, 2017 at 9:57 PM, Tony Aiuto <tony.aiuto_at_gmail.com> wrote:
>>
>> On Thu, Mar 30, 2017 at 2:03 PM, bruce <badouglas_at_gmail.com> wrote:
>>> Hi.
>>>
>>> I know this is a bit off topic. It's python + curl...!
>>>
>>>
>>> Trying to understand the "correct" way to run a sys command ("curl")
>>> and to get the potential stderr. Checking Stackoverflow (SO), implies
>>> that I should be able to use a raw/text cmd, with "shell=true".
>>
>> Never use "shell=True". You want to specify the command to Popen as a list.
>> cmd=['curl', '-sS', '-A',
>>      'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; Trident/5.0; yie8)',
>>      ......]
>> p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
>> s_out, s_err = p.communicate()
>>
> Hey Tony...
>
> Thanks for the reply... However, you're wrong.. using the "shell=True"
> seems to have to do with os cmds, vs programs, and how the different
Err, no, that is not correct. Using shell=True sends the command to your
system shell. On Windows that is the program named by the COMSPEC
environment variable, normally cmd.exe. On POSIX systems, Linux included,
it is /bin/sh, not whatever shell is in the environment variable $SHELL.
This has nothing to do with "os cmds": in either case, subprocess *is*
invoking an OS command. More reading here:
http://stackoverflow.com/a/3172488/3784644,
https://docs.python.org/2/library/subprocess.html#using-the-subprocess-module
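
To make the difference concrete, here is a minimal sketch (Python 3) of the list form, where no shell sits between Python and the child. The child is a stand-in built from sys.executable so the example runs without curl or network access; the user agent string and the example.invalid URL are placeholders of mine, not values from this thread.

```python
import subprocess
import sys

# Stand-in child process (an assumption, used instead of curl): it prints the
# arguments it received and writes one line to stderr.
child = [sys.executable, "-c",
         "import sys; print(sys.argv[1:]); sys.stderr.write('oops\\n')",
         "-A", "Mozilla/5.0 (compatible; MSIE 9.0)",
         "http://example.invalid/?a=1&b=2"]

# No shell=True: each list element reaches the child as exactly one argument,
# so spaces, quotes, and '&' need no escaping.
proc = subprocess.Popen(child, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
s_out, s_err = proc.communicate()

print(s_out.decode().strip())   # the argument list as the child saw it
print(s_err.decode().strip())   # whatever the child wrote to stderr
print(proc.returncode)          # exit status of the child
```

With a real curl call the list would start with 'curl' instead of sys.executable; the Popen and communicate() calls stay the same.
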
> shells interpret the commands.. Stack Overflow has a number of diff
> questions that get into some of the issues.
>
> In my case the following now works fine -- notice the (s,e)=proc.communicate()
>
> cmd='curl -sS '
> #cmd=cmd+'-A "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"'
> cmd=cmd+"-A '"+user_agent+"'"
> ##cmd=cmd+' --cookie-jar '+cname+' --cookie '+cname+' '
> cmd=cmd+' --cookie-jar '+ff+' --cookie '+ff+' '
> #cmd=cmd+'-e "'+referer+'" -d "'+tt+'" '
> #cmd=cmd+'-e "'+referer+'" '
> cmd=cmd+'-L "'+url1+'"'
>
> try_=1
> while(try_):
>     proc=subprocess.Popen(cmd,
>         shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>     (s,err)=proc.communicate()
>     s=s.strip()
>     err=err.strip()
>
>     # err is a string, so compare against '' rather than 0
>     if(err==''):
>         try_=''
>
> --------------------------------
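
One thing that may be worth checking in the loop above: communicate() hands back stderr as a string (bytes on Python 3), so a test such as err==0 never succeeds, and a retry loop keyed on it would spin forever. That could be what looked like a hang inside communicate(). Below is a hedged sketch (Python 3) that judges success by the exit status and caps the number of attempts. The name run_with_retries and the max_tries parameter are hypothetical, and the two child commands are stand-ins for curl so the example runs offline.

```python
import subprocess
import sys


def run_with_retries(argv, max_tries=3):
    """Run argv (a list, no shell) until it exits 0 or max_tries is reached.

    Returns (returncode, stdout_text, stderr_text, tries_used).
    """
    for attempt in range(1, max_tries + 1):
        proc = subprocess.Popen(argv, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        out, err = proc.communicate()
        out = out.decode(errors="replace").strip()
        err = err.decode(errors="replace").strip()
        # Success is decided by the exit status; err is text and is kept
        # only for reporting.
        if proc.returncode == 0:
            break
    return proc.returncode, out, err, attempt


# Stand-ins for a curl invocation (assumptions, not commands from the thread).
ok = [sys.executable, "-c", "print('page body')"]
bad = [sys.executable, "-c",
       "import sys; sys.stderr.write('could not resolve host\\n'); sys.exit(6)"]

print(run_with_retries(ok))
print(run_with_retries(bad))
```

The cap on attempts is a design choice of the sketch: a command that can never succeed then fails after a few tries instead of looping indefinitely.
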
>
>
>>>
>>> If I leave the stderr out, and just use
>>> s=proc.communicate()
>>> the test works...
>>>
>>> Any pointers on what I might inspect to figure out why this hangs on
>>> the proc.communicate process/line??
>>>
>>> I'm showing a very small chunk of the test, but its the relevant piece.
>>>
>>> Thanks
>>>
>>>
>>> .
>>> .
>>> .
>>>
>>> cmd='[r" curl -sS '
>>> #cmd=cmd+'-A "Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0"'
>>> cmd=cmd+"-A '"+user_agent+"'"
>>> ##cmd=cmd+' --cookie-jar '+cname+' --cookie '+cname+' '
>>> cmd=cmd+' --cookie-jar '+ff+' --cookie '+ff+' '
>>> #cmd=cmd+'-e "'+referer+'" -d "'+tt+'" '
>>> #cmd=cmd+'-e "'+referer+'" '
>>> cmd=cmd+"-L '"+url1+"'"+'"]'
>>> #cmd=cmd+'-L "'+xx+'" '
>>>
>>> try_=1
>>> while(try_):
>>>     proc=subprocess.Popen(cmd,
>>>         shell=True,stdout=subprocess.PIPE,stderr=subprocess.PIPE)
>>>     s,err=proc.communicate()
>>>     s=s.strip()
>>>     err=err.strip()
>>>
>>>     if(err==0):
>>>         try_=''
>>>
>>> .
>>> .
>>> .
>>>
>>> the cmd is generated to be:
>>> cmd=[r" curl -sS -A 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT
>>> 6.1; Trident/5.0; yie8)' --cookie-jar
>>> /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
>>> --cookie
>>> /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
>>> -L
>>> 'http://www6.austincc.edu/schedule/index.php?op=browse&opclass=ViewSched&term=216F000&disciplineid=PCACC&yr=2016&ct=CC'"]
>>>
>>>
>>>
>>>
>>> test code hangs, ctrl-C generates the following:
>>> ^CTraceback (most recent call last):
>>>   File "/crawl_tmp/austinccFetch_cloud_test.py", line 3363, in <module>
>>>     ret=fetchClassSectionFacultyPage(a)
>>>   File "/crawl_tmp/austinccFetch_cloud_test.py", line 978, in fetchClassSectionFacultyPage
>>>     (s,err)=proc.communicate()
>>>   File "/usr/lib64/python2.6/subprocess.py", line 732, in communicate
>>>     stdout, stderr = self._communicate(input, endtime)
>>>   File "/usr/lib64/python2.6/subprocess.py", line 1328, in _communicate
>>>     stdout, stderr = self._communicate_with_poll(input, endtime)
>>>   File "/usr/lib64/python2.6/subprocess.py", line 1400, in _communicate_with_poll
>>>     ready = poller.poll(self._remaining_time(endtime))
>>> KeyboardInterrupt
>>>
>>>
>>>
>>> This works from the cmdline:
>>> curl -sS -A 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1;
>>> Trident/5.0; yie8)' --cookie-jar
>>> /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
>>> --cookie
>>> /crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp
>>> -L
>>> 'http://www6.austincc.edu/schedule/index.php?op=browse&opclass=ViewSched&term=216F000&disciplineid=PCACC&yr=2016&ct=CC'
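
Since that command line works in a terminal, one low-effort way to get the list form Tony suggested is to let shlex.split() do the quoting work. This is only a sketch: the string below is the command quoted above, rejoined across the mail line wraps, and nothing is executed.

```python
import shlex

# The working command line from the original post, rejoined across line wraps.
command_line = (
    "curl -sS -A 'Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; "
    "Trident/5.0; yie8)' --cookie-jar "
    "/crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp "
    "--cookie "
    "/crawl_tmp/fetchContentDir/12f5e67c_156e_11e7_9c09_3a9e85f3c88e.lwp "
    "-L 'http://www6.austincc.edu/schedule/index.php?op=browse"
    "&opclass=ViewSched&term=216F000&disciplineid=PCACC&yr=2016&ct=CC'"
)

# shlex.split() applies POSIX shell quoting rules and returns a real list,
# suitable for subprocess.Popen(argv, ...) without shell=True.
argv = shlex.split(command_line)
for arg in argv:
    print(repr(arg))
```

The quoted user agent and the URL each come out as a single list element, with the surrounding single quotes removed and the '&' characters left intact.
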
>>
>> -----------------------------------------------------------
>> Unsubscribe: https://cool.haxx.se/list/listinfo/curl-users
>> Etiquette: https://curl.haxx.se/mail/etiquette.html
>>
--
Nicholas Chambers
Technical Support Specialist
nchambers_at_lightspeedsystems.com
1.800.444.9267
www.lightspeedsystems.com

Received on 2017-03-31