tvtv.us.robots' is denied.

TimTech
Joined: 6 months
Last seen: 2 months

My WG++ stopped functioning and now I get an "Access to the path 'C:\Users\Tim\AppData\Local\WebGrab+Plus\robots\tvtv.us.robots' is denied." error. I would expect it to be denied, since it is supposed to be a read-only file as of rev 4 of the tvtv.us.ini file, which states: "need in folder \WebGrab+Plus\robots\tvtv.us.robots to be read-only and with only 2 lines:"
* User-agent: *
* User-agent: WebGrab+Plus

This was working last week and now it isn't. Any insight?

Please be kind. I'm a newbie.

mat8861
WG++ Team member, Donator
Joined: 5 years
Last seen: 1 day

The problem is that, as a registered user, you are allowed 30 channels (you have 92). With a small donation you can run up to 250 with all details. The other solution is to run 30 each time, but again that will be index only, with no details. Check your license log and also have a look here: http://www.webgrabplus.com/content/support-us
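
For anyone wanting to check their own channel count against the limit mentioned above, here is a minimal sketch. It assumes the config is well-formed XML with `<channel>` elements somewhere under the root; the `<settings>` wrapper and the second channel in the sample are made up for illustration.

```python
import xml.etree.ElementTree as ET

FREE_LIMIT = 30  # registered-user limit quoted in the post above


def count_channels(config_xml: str) -> int:
    """Count <channel> entries anywhere in a WebGrab++ config document."""
    root = ET.fromstring(config_xml)
    return sum(1 for _ in root.iter("channel"))


# Hypothetical, trimmed-down config used only for illustration.
sample = """<settings>
  <channel update="i" site="tvtv.us" site_id="6392D/48" xmltv_id="Discovery Channel (US) - Eastern Feed">Discovery Channel (US) - Eastern Feed</channel>
  <channel update="i" site="tvtv.us" site_id="0000X/1" xmltv_id="Example Channel">Example Channel</channel>
</settings>"""

n = count_channels(sample)
print(f"{n} channels configured; limit {FREE_LIMIT}; over limit: {n > FREE_LIMIT}")
```

To run it against a real config, read the file into a string first and pass that to `count_channels`.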

TimTech

Thanks for the help. What doesn't make sense is that I was able to glean 30 channels (my limit) before this week without the "Access to the path 'C:\Users\Tim\AppData\Local\WebGrab+Plus\robots\tvtv.us.robots' is denied." error. Why suddenly do I get the access denied to the robots file and no updates at all?
I'll try trimming my channels to 30 and see what happens.

TimTech

I shaved my list down to 30 channels. Just like I thought, I'm still getting the "access denied" error.
It seems as though tvtv.us.ini needs updating. Anyone else having issues with this site?

mat8861

tvtv.robots needs to be read-only for the user running WebGrab and modified to contain the 2 lines. There are lots of posts about it. It works fine.
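
A small script can confirm what is actually on disk: whether the read-only flag is set, which non-empty lines the file holds, and whether any stray blank lines crept in. This is only a sketch; the two-line content is taken from the first post, and the demo runs on a throwaway copy in a temp folder rather than the real robots folder.

```python
import os
import stat
import tempfile


def inspect_robots(path: str) -> dict:
    """Report the read-only flag and the line layout of a robots file."""
    with open(path, encoding="utf-8") as fh:
        lines = fh.read().splitlines()
    mode = os.stat(path).st_mode
    return {
        "read_only": not (mode & stat.S_IWRITE),        # owner write bit cleared
        "lines": [ln for ln in lines if ln.strip()],    # non-empty lines only
        "blank_lines": sum(1 for ln in lines if not ln.strip()),
    }


# Demo on a temporary copy (assumed content; one blank line added on purpose).
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "tvtv.us.robots")
    with open(demo, "w", encoding="utf-8") as fh:
        fh.write("User-agent: *\n\nUser-agent: WebGrab+Plus\n")
    os.chmod(demo, stat.S_IREAD)                  # set the read-only flag
    report = inspect_robots(demo)
    os.chmod(demo, stat.S_IREAD | stat.S_IWRITE)  # restore so cleanup succeeds

print(report)
```

Pointing `inspect_robots` at the real file in the robots folder would show the same three fields for it.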

TimTech

Attached is a copy of my tvtv.us.robots file with a txt extension so it could be uploaded. It is a read only file in my \AppData\Local\WebGrab+Plus\robots folder. This grab worked last week and not now. I haven't changed anything in between.
By the way, there is no underscore in my file name in the folder. The upload must have put that in.

mat8861

Looks like you have extra empty lines. Check here: http://webgrabplus.com/comment/19363#comment-19363
If you run 10 channels, does it work?

TimTech

I shaved my list to 10 channels, deleted the tvtv.us.robots file, ran WG, modified the robots file as attached (read only), ran WG again and still got the access denied error. If the robots file is not read only, the file gets overwritten so I know WG has access when not read only.
Thanks for helping me out. That other thread you linked is very similar and thought it would fix my issue. The extra lines in the uploaded robots file are anomalies of the upload process. They aren't in the actual file. Are you using tvtv.us successfully?

mat8861

Yes, works fine here. As you can see, my robots file was created/modified and last accessed on 1 Jan 2020. Its attribute is read-only. I also tested channel creation (New York lu2381d), which works fine too.

TimTech

This is how I fixed it for now:
Deleted tvtv.us.robots
Deleted hot_cookies.txt
Re-ran WG with my 30 channel list.
Changed two "Disallow" to "Allow" (results shown below) in newly created tvtv.us.robots file.
User-agent: *
Allow: /tvm/
Disallow: /gn/
User-agent: WebGrab+Plus
Allow: /

and DID NOT make it read only.

Ran WG and it is pulling channels as I write. I'll post log after it finishes.
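
The manual edit described above (flipping selected "Disallow" rules to "Allow") could be scripted roughly like this. It is a sketch only: the "generated" text is an assumption about what WG wrote before the edit, inferred from the result posted above, and `allow_paths` is a made-up helper name.

```python
from typing import Iterable


def allow_paths(robots_text: str, paths: Iterable[str]) -> str:
    """Turn 'Disallow: <path>' into 'Allow: <path>' for the given paths only."""
    wanted = set(paths)
    out = []
    for line in robots_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "disallow" and value.strip() in wanted:
            line = f"Allow: {value.strip()}"
        out.append(line)
    return "\n".join(out) + "\n"


# Assumed content of the freshly generated file (not confirmed in the thread).
generated = (
    "User-agent: *\n"
    "Disallow: /tvm/\n"
    "Disallow: /gn/\n"
    "User-agent: WebGrab+Plus\n"
    "Disallow: /\n"
)

edited = allow_paths(generated, ["/tvm/", "/"])
print(edited, end="")
```

With those two paths the output matches the five lines posted above, including the remaining `Disallow: /gn/`.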

TimTech
mat8861 wrote:

Yes, works fine here. As you can see, my robots file was created/modified and last accessed on 1 Jan 2020. Its attribute is read-only. I also tested channel creation (New York lu2381d), which works fine too.

I noticed you are running rev 4 of tvtv.us.ini whereas I am running rev 5
found: C:\Users\mat88\AppData\Local\WebGrab+Plus\siteini.user\USA\tvtv.us.ini -- Revision 04
vs
found: C:\Users\Tim\AppData\Local\WebGrab+Plus\tvtv.us.ini -- Revision 05
Maybe that is where our methods/results differ.
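
To compare which siteini file and revision each log picked up, the "found:" lines can be parsed with a short regular expression. This is a sketch that assumes the log line layout is exactly as quoted above; `parse_found` is a hypothetical helper name.

```python
import re

# Matches lines such as: found: <path> -- Revision 05
FOUND_RE = re.compile(r"found:\s+(?P<path>.+?)\s+--\s+Revision\s+(?P<rev>\d+)")


def parse_found(line: str):
    """Return (path, revision) from a 'found:' log line, or None if no match."""
    m = FOUND_RE.search(line)
    return (m.group("path"), int(m.group("rev"))) if m else None


log_lines = [
    r"found: C:\Users\mat88\AppData\Local\WebGrab+Plus\siteini.user\USA\tvtv.us.ini -- Revision 04",
    r"found: C:\Users\Tim\AppData\Local\WebGrab+Plus\tvtv.us.ini -- Revision 05",
]

for line in log_lines:
    path, rev = parse_found(line)
    print(f"rev {rev:02d}  {path}")
```

Note that the two paths also differ (siteini.user\USA versus the WebGrab+Plus folder itself), which the parsed output makes easy to spot.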

mat8861

It's the same ini... I didn't change the revision on my copy.

TimTech

So, tvtv.us ran without the robots file being read only. Attached is the log

xdetoursx
Donator
Joined: 4 years
Last seen: 3 weeks

I just started having this issue with TVTV.CA the other day. Was working fine prior to that.

I tried both methods and neither one works. If I update the robots file with "Allow" it automatically gets reverted back to "Disallow" when WG is run

If I change it to Read Only and run, then I get an access denied error. Access to the path 'C:\Users\XXXXX\AppData\Local\WebGrab+Plus\robots\tvtv.ca.robots' is denied.

Either way, can't get it to scrape.

TimTech
xdetoursx wrote:

If I update the robots file with "Allow" it automatically gets reverted back to "Disallow" when WG is run
If I change it to Read Only and run, then I get an access denied error. Access to the path 'C:\Users\XXXXX\AppData\Local\WebGrab+Plus\robots\tvtv.ca.robots' is denied.
Either way, can't get it to scrape.

This is what my working robots file looks like (.txt added so it would upload to this forum). Just noticed there is a Disallow still in there. Did you delete the hot_cookies.txt file as well? Not sure if that makes a difference or not...

mat8861

Don't know why; it works fine for me. My user-agent:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 Edg/79.0.309.71
My robots folder is read-only.
My tvtv.ca.robots (attached) is also read-only.
If it works for me, it must work for you guys.

TimTech

Might be our OS. Mine is Windows 7 Pro, EPGFreak is Windows 2019 Server. Don't know what OS xdetoursx is using.
I thought the robots file (not folder) is supposed to be read only.

mat8861

My only doubt is Windows Server... but if you run as admin it shouldn't make a difference. Of course, access denied looks like a security problem. I would check the permissions on the folder/file for the user that runs WG++.
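
One rough way to test this from the same user account that runs WG++ is to try opening the file for writing and see whether the OS refuses. The sketch below assumes nothing about WG++ internals; `check_robots_access` is a made-up helper, and it opens the file in append mode without writing anything to it.

```python
import os
import tempfile


def check_robots_access(folder: str, name: str) -> dict:
    """Check whether the current user can open folder/name for writing."""
    path = os.path.join(folder, name)
    result = {
        "folder_exists": os.path.isdir(folder),
        "file_exists": os.path.isfile(path),
        "file_writable": None,  # stays None when the file is missing
    }
    if result["file_exists"]:
        try:
            with open(path, "a"):   # append mode: content is left untouched
                pass
            result["file_writable"] = True
        except PermissionError:     # the "access denied" case
            result["file_writable"] = False
    return result


# Demo in a temp folder; replace with the real robots folder to test it.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "tvtv.us.robots"), "w").close()
    present = check_robots_access(tmp, "tvtv.us.robots")
    missing = check_robots_access(tmp, "tvtv.ca.robots")

print(present)
print(missing)
```

A `file_writable` of False on a file marked read-only is expected; False on a file that is not read-only would point at folder or file permissions instead.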

xdetoursx

Awesome. Got it working. Deleted my old robots file then made a new one like yours mat and saved (not Read Only though) and it's working

Also changed the permissions on the file.

Thanks

TimTech
mat8861 wrote:

Yes, works fine here. As you can see, my robots file was created/modified and last accessed on 1 Jan 2020. Its attribute is read-only. I also tested channel creation (New York lu2381d), which works fine too.

Matt,
I noticed your output contains items my output doesn't have although we are supposedly using the same tvtv.us.ini file.
An example is your Disney Eastern Feed:

<programme start="20200719041000 +0000" stop="20200719043500 +0000" channel="Disney - Eastern Feed">
  <title lang="en">Jessie</title>
  <sub-title lang="en">Help Not Wanted</sub-title>
  <desc lang="en">When Jessie needs some extra money for a gift, she accepts a job out of desperation.</desc>
  <credits>
    <actor>Debby Ryan</actor>
    <actor>Kevin Chamberlin</actor>
    <actor>Peyton List</actor>
  </credits>
  <category lang="en">Sitcom</category>
  <icon src="https://cdn.tvpassport.com/image/show/480x720/68706.jpg"/>
  <episode-num system="onscreen">S3 E14</episode-num>
  <rating system="US">
    <value>TVG</value>
  </rating>
</programme>

My output only has title, sub-title, and desc lang (example of a Disney Eastern Feed entry):

<programme start="20200727025500 +0000" stop="20200727032000 +0000" channel="Disney - Eastern Feed">
  <title lang="en">Gabby Duran and the Unsittables</title>
  <sub-title lang="en">Tailoring Swift</sub-title>
  <desc lang="en">When Gabby gets a bad review from a babysitting client, she suspects they may be up to no good.(n)</desc>
</programme>

Did you change something to extract the extra entries like category, actor, rating, etc?

Of my entire 30 channels, I don't have anything other than title, sub-title, and desc lang

Thanks again for your insight.
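
For comparing two guide files across all channels rather than by eye, a short script can tally which elements appear under each programme entry. This is a sketch with a cut-down sample built from the two entries above; the `<tv>` wrapper is assumed and `field_counts` is a made-up name.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def field_counts(xmltv: str) -> Counter:
    """Count, per element name, how many programmes contain that element."""
    root = ET.fromstring(xmltv)
    counts = Counter()
    for prog in root.iter("programme"):
        counts.update({child.tag for child in prog})  # each tag once per programme
    return counts


# Cut-down sample based on the two entries quoted above.
sample = """<tv>
  <programme channel="Disney - Eastern Feed">
    <title lang="en">Jessie</title>
    <sub-title lang="en">Help Not Wanted</sub-title>
    <desc lang="en">...</desc>
    <credits><actor>Debby Ryan</actor></credits>
    <category lang="en">Sitcom</category>
    <rating system="US"><value>TVG</value></rating>
  </programme>
  <programme channel="Disney - Eastern Feed">
    <title lang="en">Gabby Duran and the Unsittables</title>
    <sub-title lang="en">Tailoring Swift</sub-title>
    <desc lang="en">...</desc>
  </programme>
</tv>"""

counts = field_counts(sample)
for tag, n in sorted(counts.items()):
    print(f"{tag}: {n}")
```

Running it on each person's full output would show at a glance whether elements like credits, category and rating are missing everywhere or only for some shows.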

mat8861

I'm using the one in siteini pack... rev 5
https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...

TimTech
mat8861 wrote:

I'm using the one in siteini pack ....rev 5
https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...

Another head scratcher. I'm using the unmodified ini from the siteini.pack as well. I wonder if our different output is because of our webgrab++.config.xml using a different site id?
An example of mine is:
<channel update="i" site="tvtv.us" site_id="6392D/48" xmltv_id="Discovery Channel (US) - Eastern Feed">Discovery Channel (US) - Eastern Feed</channel>

mat8861

I use 2381D/278. Here you go, they look the same. Of course, not all shows have complete info such as actor, director, etc.
Are you using rev 5?

TimTech

Yep, using Rev 5 of the ini

