
Updated freeview.com.au Australian EPG scraper

i286
Offline
Joined: 7 years
Last seen: 5 years
Updated freeview.com.au Australian EPG scraper

Hi all, I finally registered to say I have created an INI for the newly updated freeview.com.au site.

The new site is nothing like the old one, so the old freeview.com.au INI by Blackbear199 no longer stood a chance.

My INI strips the production year from the description and puts it correctly into the production year field.
The rest of the content is there, and I have not noticed any errors as of yet.

I created this from scratch, and I intend to make two updates to it when I can get around to it:
1. Currently the episode listings are only in xmltv standard for use with NextPVR; I will add the section for standard use too.
2. Release a small PowerShell/VBS utility that reads the Freeview xlsx automatically and updates the site_ids in WebGrab++.config.xml in case there are any changes.

Point 2 is only because, on one occasion since the new site came about, I found that one channel had switched its ID for a little while without my knowing.
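
As a rough illustration of what such a utility might do (a Python sketch, not the planned PowerShell/VBS tool), the function below rewrites the site_id of freeview.com.au channels in a config string. The function name update_site_ids, the sample config and the ID values are all made up for illustration, and reading the xlsx itself is left out.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of a WebGrab++.config.xml; the IDs are invented.
SAMPLE_CONFIG = """<settings>
  <timespan>8</timespan>
  <channel update="f" site="freeview.com.au" site_id="10100221" xmltv_id="Sydney - ABC1">Sydney - ABC1</channel>
  <channel update="f" site="freeview.com.au" site_id="10100222" xmltv_id="Sydney - ABC ME">Sydney - ABC ME</channel>
</settings>"""


def update_site_ids(config_xml, latest_ids):
    """Return (new_xml, changed) where changed lists the xmltv_ids whose site_id was replaced.

    latest_ids maps xmltv_id -> current site_id (for example, as read from the Freeview xlsx).
    """
    root = ET.fromstring(config_xml)
    changed = []
    for channel in root.iter("channel"):
        # Leave channels that belong to other sites untouched.
        if channel.get("site") != "freeview.com.au":
            continue
        xmltv_id = channel.get("xmltv_id")
        new_id = latest_ids.get(xmltv_id)
        if new_id is not None and channel.get("site_id") != new_id:
            channel.set("site_id", new_id)
            changed.append(xmltv_id)
    return ET.tostring(root, encoding="unicode"), changed


new_xml, changed = update_site_ids(SAMPLE_CONFIG, {"Sydney - ABC1": "10100299"})
print(changed)
```

Only channels whose ID actually differs are touched, so running it repeatedly on an up-to-date config changes nothing.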

Anyway, the process listed in the file is outdated. Use the method below if you have Excel or an equivalent:
1. Get the channel list from http://freeviewcdn.azureedge.net/XML/config.xlsx (Freeview site)
2. On the services tab, insert an additional column after column F 'Region'; let's say the new column will be G
3. In column G, row 2, paste the following without the square brackets [="<channel update=" & CHAR(34) & "f" & CHAR(34) & " site=" & CHAR(34) & "freeview.com.au" & CHAR(34) & " site_id=" & CHAR(34) & SUBSTITUTE(SUBSTITUTE(A2,"0x",""),".","") & CHAR(34) & " xmltv_id=" & CHAR(34) & D2 & CHAR(34) & ">" & D2 & "</channel>"] and fill it down to the end of the spreadsheet data
4. Filter column F 'Region' to suit your own region, then copy the column G we created into 'WebGrab++.config.xml'
5. Set <timespan>8</timespan> in 'WebGrab++.config.xml', as up to 9 days of EPG data are available when Freeview updates the listing
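
For anyone without Excel, the formula in step 3 can be expressed in a few lines of Python. This is a minimal sketch, assuming the rows have already been read from the spreadsheet as (column A, column D, column F) tuples; the sample IDs, names and regions below are invented. Unlike the Excel formula, this version also escapes XML special characters such as & in channel names.

```python
from xml.sax.saxutils import escape, quoteattr


def channel_line(raw_id, name):
    """Build one <channel> line the same way the Excel formula does."""
    # Equivalent of SUBSTITUTE(SUBSTITUTE(A2,"0x",""),".","")
    site_id = raw_id.replace("0x", "").replace(".", "")
    return (
        '<channel update="f" site="freeview.com.au" '
        f"site_id={quoteattr(site_id)} xmltv_id={quoteattr(name)}>"
        f"{escape(name)}</channel>"
    )


def lines_for_region(rows, region):
    """Keep only the rows whose Region column matches (the step 4 filter)."""
    return [
        channel_line(raw_id, name)
        for raw_id, name, row_region in rows
        if row_region == region
    ]


# Invented sample rows: (column A, column D, column F)
rows = [
    ("0x1010.0x0221", "Sydney - ABC1", "Sydney"),
    ("0x2020.0x0331", "Adelaide - ABC1", "Adelaide"),
]

for line in lines_for_region(rows, "Sydney"):
    print(line)
```

The printed lines can then be pasted into 'WebGrab++.config.xml' exactly as in step 4.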

NOTE: The timezone needs to remain UTC, as that is how Freeview now provides the dates. Do not change it!

NOTE2: In post #5, Blackbear199 has detailed how to use channels.xml as an alternative solution.

Hope you enjoy
 

janKO
Offline
Donator
Joined: 10 years
Last seen: 4 hours

Just asking: is it possible to make an xmltv file from Excel with WG++?

i286
Offline
Joined: 7 years
Last seen: 5 years

Awesome work on the channel info output Blackbear199.

I haven't had a chance to even consider how to do it, plus WebGrab+Plus / regex was already a steep learning curve haha.
Only managed to finish the INI around 2am this morning :P
 

I am now wondering whether the channel creation can be filtered via WebGrab for a certain region only?

I won't have time to investigate yet, but here's hoping. I reckon it may be possible with regex :) Actually, I have a fairly good idea how to do it in my head now. It will be similar to the actual channel scraper, where it splits the index.

I wish it were not required to convert to xml, but options are limited in that way, I guess.
Another way, if you do not want to convert, would be to use Excel as in my steps above with this formula, again without the square brackets.

I would put the formula in a new column to the right of column F, in row 2, so it is a direct copy and paste. Ignore what I put in the original instructions if using this. Then fill it down to the end of the data, filter column F 'Region' to your region, and copy the column into your 'WebGrab++.config.xml'.

[="<channel update=" & CHAR(34) & "f" & CHAR(34) & " site=" & CHAR(34) & "freeview.com.au" & CHAR(34) & " site_id=" & CHAR(34) & SUBSTITUTE(SUBSTITUTE(A2,"0x",""),".","") & CHAR(34) & " xmltv_id=" & CHAR(34) & D2 & CHAR(34) & ">" & D2 & "</channel>"]

Cheers

Dennisdv
Offline
Has donated long time ago
Joined: 8 years
Last seen: 7 months

Very usable. With this being a recently updated source and XML all the way (except for the channel index, obviously), it might survive a long time without needing updates to the .ini.

I have two requests:

The first is to grab "programme program_type" and add it as a second Category. This will make sure that an MCE backend interprets the entry as a Movie or Series and treats it accordingly (applies color to the guide, shows it in the Movie guide).

The second is to grab "attributes" from the same line and put them in the appropriate XMLTV element. Again, MCE will show that in its guide display and pass it on to Kodi, which shows it as well.

Also, what is the story with line 53? Should that not be starred out?

 

thnx

i286
Offline
Joined: 7 years
Last seen: 5 years
Dennisdv wrote:

Very usable. With this being a recently updated source and XML all the way (except for the channel index, obviously), it might survive a long time without needing updates to the .ini.

I have two requests:

The first is to grab "programme program_type" and add it as a second Category. This will make sure that an MCE backend interprets the entry as a Movie or Series and treats it accordingly (applies color to the guide, shows it in the Movie guide).

The second is to grab "attributes" from the same line and put them in the appropriate XMLTV element. Again, MCE will show that in its guide display and pass it on to Kodi, which shows it as well.

Also, what is the story with line 53? Should that not be starred out?

 

thnx

You are right, those 2 attributes are missing. I will find their xmltv tags.

Both are easy to add, and I will do it after work/dinner. I will base it on Blackbear199's revision with the channels generator.

Since my file does not have 53 lines, presumably you are talking about Blackbear199's revision, which does. That line is in the channel file creation section and should be starred out by default.

Cheers

i286
Offline
Joined: 7 years
Last seen: 5 years
Blackbear199 wrote:

You could select regions in the channels.xml creation here: replace the sort with a substring.

*index_site_id.modify {sort(ascending,string)}
*sort_by.scrub {single(target="index_site_id")| -- |||}

index_site_id has elements in this format at this point..

[channel_id] -- [region] - [channel name]

I have to combine everything into a single element, as sort can only handle a single element. If I didn't, and sorted index_site_id and index_site_channel separately, they wouldn't match correctly.

So rather than sorting here, just substring the elements you want by region name. For example, say you wanted Adelaide only:

index_site_id.modify {substring(type=regex)|"^[^-]*-- Adelaide - .*$"}

One could also create a region.xml file so the region name can be read from the WebGrab++.config.xml, and Adelaide in the above example could be replaced by an element.

Then finish with the existing lines that separate index_site_id and index_site_channel:

index_site_channel.modify {substring(type=regex)|'index_site_id' " -- (.*)$"}
index_site_id.modify {substring(type=regex)|"^([^ -]*) --"}
end_scope

But honestly, I think the channels.xml is perfectly usable as it is, now that it's sorted by region name.
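
To see what those regexes do outside WebGrab+Plus, here is a small Python sketch that applies equivalent patterns to elements in the "[channel_id] -- [region] - [channel name]" format. It only mirrors the filtering and splitting logic and is not how WebGrab+Plus evaluates its own operations; the function name filter_region and the sample elements are invented.

```python
import re

# Same split as the two final .modify lines: id before " -- ", the rest after it.
SPLIT = re.compile(r"^([^ -]*) -- (.*)$")


def filter_region(elements, region):
    """Return (site_id, channel) pairs for elements belonging to one region."""
    # Same shape as the Adelaide example, with the region name substituted in.
    keep = re.compile(r"^[^-]*-- " + re.escape(region) + r" - .*$")
    pairs = []
    for element in elements:
        if not keep.match(element):
            continue
        parts = SPLIT.match(element)
        if parts:
            pairs.append((parts.group(1), parts.group(2)))
    return pairs


# Invented sample elements in the combined format
elements = [
    "10100221 -- Sydney - ABC1",
    "20200331 -- Adelaide - ABC1",
    "20200332 -- Adelaide - SBS",
]

for site_id, channel in filter_region(elements, "Adelaide"):
    print(site_id, "|", channel)
```

Filtering on the combined element first and splitting afterwards keeps each ID paired with its own channel name, which is the same reason the sort was done on a single element.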

 

Thanks for the information. I will play with it in the future when I can as I am interested.
But as you say, it is not absolutely necessary or a priority, as there are two very usable methods already.

Still tho, never hurts to learn something new...

Cheers

Dennisdv
Offline
Has donated long time ago
Joined: 8 years
Last seen: 7 months

Sorry, my mistake: the HD and CC elements do not exist in the XMLTV DTD, though they do exist in the MXF format MCE uses. Since I use BigScreenEPG for importing into MCE, I'll have to do things there.

Program type comes across nicely - looks like ready to roll!!

One problem: when I open the guide XML file with EPGEdit and go into the details for a movie, I get:

************** Exception Text **************
System.ArgumentException: Column 'credits_Id' does not belong to table credits.
   at System.Data.DataRow.GetDataColumn(String columnName)
   at System.Data.DataRow.get_Item(String columnName)
   at EpgEdit.EpgEdit.getDetails(EpgProg p, Int32 pi, String s)
   at EpgEdit.EpgEdit.drawProgDetails(EpgProg p)
   at EpgEdit.EpgEdit.pbProgs_KeyDown(Object sender, KeyEventArgs e)
   at System.Windows.Forms.Control.OnKeyDown(KeyEventArgs e)
   at System.Windows.Forms.Control.ProcessKeyEventArgs(Message& m)
   at System.Windows.Forms.Control.ProcessKeyMessage(Message& m)
   at System.Windows.Forms.Control.WmKeyChar(Message& m)
   at System.Windows.Forms.Control.WndProc(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.OnMessage(Message& m)
   at System.Windows.Forms.Control.ControlNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

It doesn't cause a problem in the guide itself, but something must be going wrong?

I have also been getting this in the log file:

[  Debug ] xmltv input file - C:\ProgramData\ServerCare\WebGrab\guide.xml - found
[  Debug ] 1082 superfluous shows removed
[Error   ] Could find existing channel (xmltv_id=Sydney - ABC News 24) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - ABC1) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - ABC2 / ABC4) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - ABC ME) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - Nine) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 9HD) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 9GEM) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 9GO) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 9Life) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 9Extra) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - SBS) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - SBS 2) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - SBS Food Network) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - NITV) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - SBS HD) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - Seven) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 7mate HD Sydney) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 7TWO) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 7mate) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - 7flix) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - RACING.COM) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - ONE) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - TEN HD) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - TEN) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - TVSN) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - ELEVEN) in the config file
[Error   ] Could find existing channel (xmltv_id=Sydney - SpreeTV) in the config file
[  Info  ] 
[  Info  ] 
[  Info  ]       i=index  .=same  c=change  g=gab  r=replace  n=new

 

 

 

 
