Struggling to configure for grabbing Radio Times (UK) data

londc3
Donator

Hi,

 

I'm running WGP 5.5 on Windows 11. It is a new installation that I want to configure like the one on my old PC, which was working fine with an earlier version of WGP. The new version downloads the up-to-date site.ini pack if it detects that the installed one is not current. My issue is that I cannot get my old configuration working in the new environment. Specifically, I cannot get Radio Times (UK) data. Everything I have tried results in:

no shows in indexpage!

   Summary for update of       5
     no changes, no update necessary !
     unchanged shows inspected 0
     total after update        0

I have tried the channel entries from the old configuration, the channel entries in the various radiotimes.com.channels files in the site.ini pack and I have also twice tried using the "c2" method to create a Provider-Regions xml (one with "Popular channels" and one with "Freeview") and copied the channel entries from that to my config.  All are getting the same result.  Obviously I'm doing something wrong!  Could anyone please help me work out what it is?

Thanks,
Dan

 

londc3

Update:  I tried running the stable and previously working configuration on my old PC (WGP 5.3, Windows 10) and got the same negative results as I'm getting on the new PC.  As far as I can tell this is with an unchanged config file from when it was last run (about 12 days ago) and unchanged site.ini pack.  Could it be that Radio Times have changed their site, breaking both old and current versions of the Radio Times site.ini files?

Thanks,
Dan

 

mat8861
WG++ Team member, Donator

You cannot use the old channel lines. The new siteini (version 27, by Blackbear199) https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...

describes what to do. Basically there are four lists: C1 (dummy channel line), C2 (dummy channel line), C3 (a mix of lines from C1 and C2) and C4 (for radio).

 

According to your config, your new channels are:

<channel update="i" site="radiotimes.com" site_id="tv##default##hnq7@@hnrv##BBC One London" xmltv_id="BBC One London">BBC One London</channel>
<channel update="i" site="radiotimes.com" site_id="tv##default##hnq7@@hnrv##BBC Two England" xmltv_id="BBC Two England">BBC Two England</channel>
<channel update="i" site="radiotimes.com" site_id="tv##default##hnq7@@hnrv##ITV1 London" xmltv_id="ITV1 London">ITV1 London</channel>
<channel update="i" site="radiotimes.com" site_id="tv##default##hnq7@@hnrv##Channel 4" xmltv_id="Channel 4">Channel 4</channel>
<channel update="i" site="radiotimes.com" site_id="tv##default##hnq7@@hnrv##5" xmltv_id="5">5</channel>

Note: the log contains "404 Not Found" errors. This is because the main show doesn't have details.

londc3

Thanks for this.  I had some difficulty following the instructions (can post on this later) but I did manage to grab some Radio Times data using entries from the radiotimes.com.channels.london.xml file that someone kindly left in the UK folder.  One issue remains - I only get 1 day's worth of data even though I have set   <timespan>7</timespan>

The log in the attached zip seems to indicate that I have set the timespan correctly as it contains the following line:

[  Info  ] update requested for - 68 - out of - 68 - channels for 8 day(s)

(Note that the log was from my second attempt on the same day, so most of the data was unchanged from the first attempt).

Although it's large I have attached a truncated but sizeable chunk of the produced guide.xml file so you can see that no shows with start date beyond 20251208 are included.  The size of the guide.xml also indicates this - I normally find a full week's worth of the Radio Times data creates a file of around 17MB instead of the 3.6MB produced here.
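A quick way to check how many days a guide.xml really covers is to count programmes per start date. The following is only a sketch: it assumes the file follows the usual XMLTV layout (`<programme start="YYYYMMDDhhmmss +zzzz" ...>`), and the function name `programmes_per_day` is made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def programmes_per_day(xmltv_source):
    """Count <programme> elements per start date (YYYYMMDD).

    xmltv_source may be a file name or an open binary file object.
    Dates are grouped exactly as written in the start attribute,
    without any time zone conversion.
    """
    counts = Counter()
    # iterparse avoids building the whole tree for a multi-megabyte guide
    for _event, elem in ET.iterparse(xmltv_source, events=("end",)):
        if elem.tag == "programme":
            start = elem.get("start", "")
            counts[start[:8]] += 1
            elem.clear()  # drop children we no longer need
    return dict(sorted(counts.items()))


# Example use (the path is an assumption):
# for day, n in programmes_per_day("guide.xml").items():
#     print(day, n)
```

If only one date shows up in the result, the grab itself stopped after the first day, rather than something downstream dropping the data.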

So, can anyone point me to what I need to change to get more than a day's data?

Thanks,
Dan

 

 

mat8861

Hello mate, first let me suggest that you use the settings we recommend in the sample here:

https://github.com/SilentButeo2/webgrabplus-siteinipack/blob/master/site...

in particular the mode and user-agent tags. Then set timespan to 6 (7 days), because WebGrab uses a zero-based counter, meaning 0 = 1 day.
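The zero-based counting can be written down as a tiny sketch. The helper names are made up for illustration; the only rule taken from this thread is that timespan 0 means 1 day, which also matches the log line above where timespan 7 was reported as 8 day(s).

```python
def days_requested(timespan: int) -> int:
    """Number of days WebGrab requests for a given <timespan> value."""
    if timespan < 0:
        raise ValueError("timespan must be 0 or greater")
    return timespan + 1  # zero-based: 0 means today only


def timespan_for(days: int) -> int:
    """<timespan> value needed to request the given number of days."""
    if days < 1:
        raise ValueError("days must be 1 or greater")
    return days - 1


# A full week therefore needs <timespan>6</timespan>:
# timespan_for(7) gives 6, and days_requested(7) gives 8.
```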

Last but not least, I uploaded a new revision 28, so update your siteini.pack. You can force an update by deleting the text file (siteini.pack_2025.xx.xx_xxxxxx.txt) in the siteini.pack root folder.
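If you prefer to script the forced update, a minimal sketch could look like this. It assumes the marker file matches the pattern `siteini.pack_*.txt` as in the name above; the folder location differs per installation, so it is passed in, and the function name `remove_pack_marker` is made up for this example. By default it only lists what it would delete.

```python
from pathlib import Path


def remove_pack_marker(pack_root, dry_run=True):
    """Find (and optionally delete) the siteini.pack version marker file(s).

    pack_root: the siteini.pack root folder (location is an assumption,
               it depends on where WebGrab is installed).
    dry_run:   when True, nothing is deleted; the matches are only returned.
    """
    root = Path(pack_root)
    markers = sorted(root.glob("siteini.pack_*.txt"))
    for marker in markers:
        if not dry_run:
            marker.unlink()
    return markers


# Example use (the path is hypothetical):
# print(remove_pack_marker(r"C:\path\to\siteini.pack"))          # list only
# remove_pack_marker(r"C:\path\to\siteini.pack", dry_run=False)  # delete
```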

Please note that some channels do not have show details. This causes the "error downloading page: Response status code does not indicate success: 404 (Not Found)" message, but as you will understand, that is a site issue. Also, it seems that radiotimes is based on a 7-day EPG, so I do not suggest using a timespan higher than 6.

Blackbear199
Has donated long time ago

You're never going to fix it if you don't understand the problem.
Around V5.4, when postbackmodepipi was added, the showsplit broke.
I noticed it and mentioned it, but it fell on deaf ears, and as far as I know it hasn't even been looked at since.
Here's the problem.
As you know, radiotimes returns JSON data.
Now, you could do the showsplit with a regex and .*? etc. and split it that way.
I call this a poor man's showsplit because that's exactly what it is: you're leaving yourself wide open for it to fail or not work correctly.
All they have to do is change one thing, or even move it, and your regex fails.
I prefer to do it just as a JSON parser would:
1. split the data into channel groups
2. select the channel group you want
3. split the channel group into individual shows.
This way it doesn't matter if they change the data structure; you're not using it to split the data.
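For anyone who wants to see the three steps outside of siteini syntax, here is a rough Python sketch. It is not the siteini code, and the JSON layout (the keys `channels`, `id`, `shows`) is entirely made up for illustration, since the real radiotimes payload is not shown in this thread.

```python
import json

# Hypothetical payload: key names and ids are invented for this example.
SAMPLE = """
{"channels": [
  {"id": "ch1", "name": "Channel A",
   "shows": [{"title": "News"}, {"title": "Weather"}]},
  {"id": "ch2", "name": "Channel B",
   "shows": [{"title": "Film"}]}
]}
"""


def shows_for_channel(raw, channel_id):
    """Return the list of shows for one channel from a JSON string."""
    data = json.loads(raw)
    # 1. split the data into channel groups
    groups = data["channels"]
    # 2. select the channel group you want
    group = next((g for g in groups if g["id"] == channel_id), None)
    if group is None:
        return []
    # 3. split the channel group into individual shows
    return list(group["shows"])


# Example use:
# for show in shows_for_channel(SAMPLE, "ch1"):
#     print(show["title"])
```

The point of the structure-based split is that reordering fields or changing whitespace inside a show does not affect where one show ends and the next begins.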

Scrub all the data:
index_showsplit.scrub {regex||^.*$||}
Guess what you will see?
timespan = 0: all OK.
timespan = 1: all OK.
timespan > 1: not OK.
After the 2nd day, instead of the data being in one chunk, you will see it's separated by 3 line feeds (\n\n\n).
index_showsplit.scrub {regex||^\[(.*?)\](?:\n\|$)||}
This works for today and tomorrow but fails for every day after that.
That's why you only get EPG for 2 days.
For some reason these line feeds cannot be scrubbed. I suspect that we see them as line feeds but WebGrab stores them differently internally, the way it uses !??! for the |.
The solution:
fix the problem instead of using band-aid fixes to try to work around it.

 

 

londc3

Thanks, that seems to have worked.  I had this working fine on my old PC with a (rather optimistic!) timespan of 15 - the intent being to just capture everything they might publish.  But I wasn't surprised that I never got more than 8-9 days' worth of listings.

Is the c1, c2, c3 mechanism explained anywhere in the documentation?  I'd like to understand it a bit better...

Cheers,
Dan

 

 

mat8861

Yes, it is in the documentation. Further help comes from the siteini headers; in particular, Blackbear199 described how to get the channel list step by step.

The concept is basically this: run C1 with a dummy channel line; then run C2 with one line from the resulting C1 list; then run C3 with a channel line from the C2 list. There are some variants of this depending on the siteini, so we strongly suggest looking at the siteini headers.

londc3

A post-script to the above conversation (and thanks, all, for your help with that):  I've just noticed that I no longer get star-rating in the output.

Looking back through old guide.xml files I see that the ones from before early December 2025 (that is, the time when I moved to a different PC with the newer version of WGP) include star-rating data.  However, the ones from after I got the new setup working do not.  I only grab Radio Times data so I don't know if this is specific to Radio Times.

If this is a config issue at my end, any pointers to what I should try in order to get the star-ratings data back?  Or is it more likely to be related to the Radio Times site.ini?
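For anyone wanting to repeat the comparison between old and new guide.xml files, a small sketch like the one below counts how many programmes carry a rating. It assumes the standard XMLTV element layout (`<star-rating><value>3/5</value></star-rating>` inside `<programme>`); the function name `star_rating_stats` is made up for this example.

```python
import xml.etree.ElementTree as ET


def star_rating_stats(xmltv_source):
    """Return (total programmes, programmes with a star-rating value).

    xmltv_source may be a file name or an open binary file object.
    """
    total = 0
    rated = 0
    for _event, elem in ET.iterparse(xmltv_source, events=("end",)):
        if elem.tag == "programme":
            total += 1
            if elem.find("star-rating/value") is not None:
                rated += 1
            elem.clear()  # keep memory use low on large files
    return total, rated


# Example use (file names are assumptions):
# print(star_rating_stats("guide_old.xml"))
# print(star_rating_stats("guide_new.xml"))
```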

Thanks,
Dan

 

mat8861

No worries, the format was changed. It's fixed now in rev. 30. Please update your siteini.


Brought to you by Jan van Straaten

Program Development - Jan van Straaten ------- Web design - Francis De Paemeleere
Supported by: servercare.nl