Hi,
I'm attempting to create an ini file for tivu.tv for the very first time, but the website works slightly differently from the usual ones and I find it hard to understand a few things in the documentation.
First of all, the website doesn't have a per-channel page for the entire week, only for one day. The entire week for all channels is on the index page, without a day separation, all in one stream. The channels are separated from each other by a <div id= section. Inside this section, all the programs are separated by another <div id= block with a unique id.
From the documentation I understood that the shows are first split out with index_showsplit, and that the xmltv data is then parsed locally from the result of that operation (correct me if I'm wrong).
I want to extract the shows with index_showsplit using the separator string method, but how can I pass the 'channel' argument to fill in the <div id="" section?
I tried a few ways, e.g. includeblock, but with no luck. The documentation always implies that the shows have a separate page and that the arguments to get them are passed in the url_index.
Thank you for your patience.
fasigno
A quick starter.
You are lucky. The site has all the index data on one page.
So in the site part you should mention:
maxdays=6.1 ==> meaning 6 days of show data (maybe it is more) and all info is on the first page
keepindexpage ==> when grabbing multiple channels for this site, no need to download the index page again, because all channels are mentioned on the same page.
The showsplit looks like:
index_showsplit.scrub {multi(debug type=regex)||<div class=\"q[^>]*id=\'28\'>(<div.*?</div>)*</div>||}
This is only valid for the channel with id=28!
The structure of the show data in the HTML is:
for channel with id 28
<div class="q" id='28'><div ... showdata1...</div><div ... showdata2...</div><div ... showdata3...</div><div ... showdata4...</div></div>
for channel with id 29
<div class="q" id='29'><div ... showdata1...</div><div ... showdata2...</div><div ... showdata3...</div><div ... showdata4...</div></div>
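To see what that showsplit pattern picks up, here is a minimal Python sketch. It is only an illustration under assumptions: the sample HTML is a made-up reduction of the structure above, and Python's `re` module is not the regex engine WebGrab+Plus uses, so the split is done in two steps (find the channel block, then list the inner divs).

```python
import re

# Hypothetical sample that mimics the structure described above;
# the real page contains more markup inside each show <div>.
html = (
    "<div class=\"q\" id='28'>"
    "<div id='a1'>showdata1</div><div id='a2'>showdata2</div>"
    "</div>"
    "<div class=\"q\" id='29'>"
    "<div id='b1'>showdata1</div>"
    "</div>"
)

# Same idea as the showsplit regex, hard-coded to the channel with id 28.
CHANNEL_28 = re.compile(
    r"<div class=\"q[^>]*id='28'>((?:<div.*?</div>)*)</div>", re.S
)

def split_shows_28(page):
    """Return the show <div> blocks that belong to channel 28 only."""
    block = CHANNEL_28.search(page)
    if block is None:
        return []
    # Step 2: every inner <div ...>...</div> is one show.
    return re.findall(r"<div.*?</div>", block.group(1), re.S)

shows = split_shows_28(html)
```

Only the two shows of channel 28 come back; the block of channel 29 is never touched, which is why the hard-coded id has to be made variable before the siteini can serve more than one channel.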
So I got:
site {url=tivu.tv|timezone=UTC+00:00|maxdays=6.1|cultureinfo=en-GB|charset=UTF-8|titlematchfactor=90|keepindexpage}
url_index{url|http://www.tivu.tv}
url_index.headers {customheader=Accept-Encoding=gzip,deflate} * to speedup the downloading of the index pages
index_showsplit.scrub {multi(debug type=regex)||<div class=\"q[^>]*id=\'28\'>(<div.*?</div>)*</div>||}
index_start.scrub {single(type=regex)||>(\d\d:\d\d)<||}
index_title.scrub {single|<p>||<br>|<br>}
And I've got all index data.
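For anyone who wants to check what the start and title scrubs above extract, here is a small Python sketch. The show block is a hypothetical example (the real markup inside a tivu.tv show div may differ), and the helper names `scrub_start` and `scrub_title` are made up for this illustration.

```python
import re

# Hypothetical show block; the actual tivu.tv markup may look different.
show = "<div id='s1'><span>21:00</span><p>Example Title<br>Example genre</p></div>"

def scrub_start(block):
    """Mimic the index_start regex: first HH:MM found between '>' and '<'."""
    match = re.search(r">(\d\d:\d\d)<", block)
    return match.group(1) if match else None

def scrub_title(block):
    """Mimic the separator-string title scrub: text after '<p>' up to '<br>'."""
    start = block.find("<p>")
    if start == -1:
        return None
    start += len("<p>")
    end = block.find("<br>", start)
    if end == -1:
        return None
    return block[start:end]

start_time = scrub_start(show)
title = scrub_title(show)
```

If a show block has no time or no `<p>...<br>` pair, both helpers return `None` instead of raising, which makes it easy to spot blocks that do not follow the expected pattern.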
Hope this will help you.
So just add the channel switch in the showsplit part and you can use it.
+ add the detail page stuff.
When you have it working, it would be nice to add it to our EPG channels page. (Let us know)
Thank you for your answer and your advice, Francis.
But my problem still persists, because I don't get from the documentation how I can use the "channel" token to change the id number.
In fact I cannot hard-code it; it has to change according to the Webgrab++.config file, where the user chooses the channels.
For example I can use it in the url_index:
url_index{url(preload="http://www.mobistar.tv/tv-guide.aspx")|http://www.mobistar.tv/epg.aspx?f_format=pgn&medium=0&lng=nl&f=|urldate|&t=xxxxx&s=|channel|,0,2,&_=|urldate|}
But not in the index_showsplit where '|' separates the blocks.
Is there maybe a trick with the index_variable_element?
fasigno
Correct.
index_variable_element is the way to go.
Just add
index_variable_element.modify{addstart|'config_site_id'}
index_showsplit.scrub {multi(debug type=regex)||<div class=\"q[^>]*id=\''index_variable_element'\'>(<div.*?</div>)*</div>||}
and off you go.
[If you want to learn fast, just download the siteini pack and search through all of the siteini files. Plenty of examples ;-) ]
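The idea behind the variable showsplit can be sketched in Python as plain text substitution: the channel id from the config is dropped into the pattern before it is applied. This is only an illustration of the principle, not of WebGrab+Plus internals; the `SITE_ID` placeholder, the sample HTML and the helper names are assumptions made for this example.

```python
import re

# Pattern template with a hypothetical SITE_ID placeholder standing in
# for the element that holds the site_id from the config.
TEMPLATE = r"<div class=\"q[^>]*id='SITE_ID'>((?:<div.*?</div>)*)</div>"

def pattern_for(site_id):
    """Build the channel-specific pattern by substituting the site id."""
    return re.compile(TEMPLATE.replace("SITE_ID", re.escape(site_id)), re.S)

def split_shows(page, site_id):
    """Return the show <div> blocks of the channel with the given id."""
    block = pattern_for(site_id).search(page)
    if block is None:
        return []
    return re.findall(r"<div.*?</div>", block.group(1), re.S)

# Made-up sample page with two channels.
html = (
    "<div class=\"q\" id='28'><div id='a1'>show A</div></div>"
    "<div class=\"q\" id='29'><div id='b1'>show B</div><div id='b2'>show C</div></div>"
)

# One pass per channel id, as it would come from the user's channel list.
per_channel = {site_id: split_shows(html, site_id) for site_id in ("28", "29")}
```

Each channel now gets only its own shows from the same downloaded page, which matches the purpose of combining the variable showsplit with keepindexpage.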
I created this. It's functional but far far away from perfection. I put your name into it too.
I disabled scraping for a few fields because the show descriptions don't follow a regular pattern and I have to master regex a lot more.
I'll continue to maintain it.
Thank you for your help.
fasigno
Anyone willing to get it to work?
Hello,
I have used it on a daily basis since then and it works well! But I'm stuck with an older version of WebGrab.
tivu.tv is an official TV guide for Italian television; consider including it in the Italian section if it's still functional.
@fasigno
Could you upload the siteini you are currently using? Just to be sure I get the correct one.
And what is the reason that you are stuck on an older WG++ revision (which revision is that?)
Thanks!
Sorry for my late answer; this is my ini file. The reason I'm stuck with an old version is simply that I update my media center only if there are stability issues. WebGrab's log says version 1.54.6/0.01.
fasigno
Thanks, your site works (with the ini not in siteini.user). I need help to get an updated channel list.
Thank you so much !! It works like a charm, great job !!
Hello. Just found this page and tried grabbing tivu.tv programs using the supplied ini and xml files but it doesn't work for me. I replaced every occurrence of http with https in the ini file but with no success.
Any help?
Thank you
Meanwhile I found a solution. This ini file works quite well: