siol.net.ini

Ivan Grozni
Offline
Joined: 11 years
Last seen: 5 years
siol.net.ini

Can you update siol.net.ini? It doesn't work. Thank you.

Ivan Grozni
Offline
Joined: 11 years
Last seen: 5 years
DVST8 wrote:

Ivan any feedback on this? Thanks

It works 
Thanks

Ivan Grozni
Offline
Joined: 11 years
Last seen: 5 years

This ini doesn't work anymore :( Can somebody make a new update, please?

illiac4
Offline
Joined: 10 years
Last seen: 7 years

Hi, is it possible to add channel icons to this ini file?
Example:
http://cdn1.siol.tv/logo2/150x80/mezzo.png
At the moment I am using a workaround that is not very nice.
Example of what I am using at the moment:

<channel update="i" site="siol.net" site_id="SLO+1" site_channel="http://bite-in.com/siol/logos_new/slo1.png" xmltv_id="SLO+1">SLO+1</channel>
<channel update="i" site="siol.net" site_id="SLO+2" site_channel="http://bite-in.com/siol/logos_new/slo2.png" xmltv_id="SLO+2">SLO+2</channel>
<channel update="i" site="siol.net" site_id="Planet+TV" site_channel="http://bite-in.com/siol/logos_new/planettv.png" xmltv_id="Planet+TV">Planet+TV</channel>
<channel update="i" site="siol.net" site_id="POP+TV" site_channel="http://bite-in.com/siol/logos_new/poptv.png" xmltv_id="POP+TV">POP+TV</channel>
<channel update="i" site="siol.net" site_id="Kanal+A" site_channel="http://bite-in.com/siol/logos_new/akanal.png" xmltv_id="Kanal+A">Kanal+A</channel>

Ivan Grozni
Offline
Joined: 11 years
Last seen: 5 years

This ini doesn't work anymore :( Can somebody update it, please?

odem81
Offline
Joined: 9 years
Last seen: 8 years

Hi,

could someone take a look at this ini? It has stopped working.

I'm getting:

error downloading page: Error: NameResolutionFailure
pausing 3 of 4 times for 15 seconds before re-try.

 

 

 

Smacca
Offline
Joined: 11 years
Last seen: 3 years

I was not getting the NameResolutionFailure error, but there was a problem with index_start.scrub. I have fixed it in this ini and it is working for me. Please note that some channel names have also changed. See the new channels.xml file.

 

 

odem81
Offline
Joined: 9 years
Last seen: 8 years

Hi,

 

Can someone check why we don't get this text in the xml: Otroški in mladinski / Risanka, Ostalo


Page link is: http://www.siol.net/tv-spored.aspx?p2=jOm1MhyqCsMpFWUoxjw8qw%3d%3d

Could you also grab this text from the page?

On the webpage it looks like this: <p class="zanr" style="font-size: 12px;">Otroški in mladinski / Risanka, Ostalo <script type="text/javascript">raty_init();</script></p>

In the ini file the line is: description.scrub {single|<p class="zanr">|<p>|</p>|<div class="clrA">}

Tnx.
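
One possible reason, sketched in Python rather than in WebGrab+Plus itself: if the start string of the scrub is matched as literal text (an assumption about how the ini line is applied), then `<p class="zanr">` never occurs on the page, because the real tag carries an extra style attribute. The helper `between` is hypothetical and only mimics a cut between a start and an end string.

```python
# Page fragment quoted in the post above.
html = ('<p class="zanr" style="font-size: 12px;">Otroški in mladinski / Risanka, Ostalo '
        '<script type="text/javascript">raty_init();</script></p>')


def between(text, start, end):
    """Hypothetical helper: text between the first `start` and the next `end`, or None."""
    i = text.find(start)
    if i == -1:
        return None
    i += len(start)
    j = text.find(end, i)
    return None if j == -1 else text[i:j].strip()


# Start string as written in the ini: the tag on the page has a style
# attribute, so this literal string is never found.
strict = between(html, '<p class="zanr">', '</p>')

# Looser cut: anchor on the stable part of the tag, skip to its closing '>',
# and stop before the embedded script.
tag_start = html.find('<p class="zanr"')
loose = between(html[tag_start:], '>', '<script') if tag_start != -1 else None
```

If that assumption holds, shortening the start string in the ini to the part of the tag that does not change would be the equivalent fix.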

 

odem81
Offline
Joined: 9 years
Last seen: 8 years

Anyone?

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Fixed the ini, added show icons and cleaned up many errors. It skips the last show of the day, I don't know why.

It's partially working now.

odem81
Offline
Joined: 9 years
Last seen: 8 years
1NSdbZVbpZDX wrote:

Fixed the ini, added show icons and cleaned up many errors. It skips the last show of the day, I don't know why.

It's partially working now.

Thank you! It works great.

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Yeah, but someone has to fix the bug where it skips the last show of every day.

That's beyond my WG knowledge. The same thing is happening with a Brazilian site.ini I'm trying to finish.

francis
Offline
Has donated long time ago, WG++ Team member
Joined: 12 years
Last seen: 1 week

And if you change index_showsplit into:

index_showsplit.scrub {multi|<div class="lst">|<span class="i3||</div></div></div>}
1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Francis has spoken!

Uploading the final, tested, working ini.

It can be transferred to EPG channels.

odem81
Offline
Joined: 9 years
Last seen: 8 years

Do you know why there are so many links for the channel logo? It is not standard for xmltv to have more than one, and tvheadend doesn't know how to parse this.

<display-name lang="sl">24Kitchen HD</display-name>
    <icon src="http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png" />
    <url>http://www.siol.net</url>

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Fixed!

odem81
Offline
Joined: 9 years
Last seen: 8 years

I can't see any change from your last two files.

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Hmm, it works well here. See the log and guide from a test I performed on the supposedly buggy channel.

odem81
Offline
Joined: 9 years
Last seen: 8 years

Strange,

my guide starts like this:

 

<?xml version="1.0" encoding="UTF-8"?>
<tv generator-info-name="WebGrab+Plus/w MDB &amp; REX Postprocess -- version 1.54.6/0.01 -- Jan van Straaten" generator-info-url="http://www.webgrabplus.com">
  <channel id="24Kitchen Adria">
    <display-name lang="sl">24Kitchen Adria</display-name>
    <icon src="http://cdn1.siol.tv/logo2/150x80/doq.png|http://cdn1.siol.tv/logo2/150x80/doq.png|http://cdn1.siol.tv/logo2/150x80/doq.png|http://cdn1.siol.tv/logo2/150x80/doq.png" />
    <url>http://www.siol.net</url>
  </channel>
  <channel id="24Kitchen HD">
    <display-name lang="sl">24Kitchen HD</display-name>
    <icon src="http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png|http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png" />
    <url>http://www.siol.net</url>
  </channel>
  <channel id="Amc">
    <display-name lang="sl">Amc</display-name>
    <icon src="http://cdn1.siol.tv/logo2/150x80/mgm.png|http://cdn1.siol.tv/logo2/150x80/mgm.png|http://cdn1.siol.tv/logo2/150x80/mgm.png|http://cdn1.siol.tv/logo2/150x80/mgm.png" />
    <url>http://www.siol.net</url>
  </channel>
  <channel id="Animal HD">
    <display-name lang="sl">Animal HD</display-name>
    <icon src="http://cdn1.siol.tv/logo2/150x80/animalhd.png|http://cdn1.siol.tv/logo2/150x80/animalhd.png|http://cdn1.siol.tv/logo2/150x80/animalhd.png|http://cdn1.siol.tv/logo2/150x80/animalhd.png" />
    <url>http://www.siol.net</url>
  </channel>

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

Upload your webgrablog.txt.

Or you can get rid of the problem by disabling the icon grab in the ini.

Just put a * at the beginning of the line:

*showicon.scrub {single|<img alt="" |src="|" |/>}

You won't get icons, though.

odem81
Offline
Joined: 9 years
Last seen: 8 years

If I use your latest ini, I just get the title of the show in the xml and nothing else.

If i use this ini http://webgrabplus.com/sites/default/files/download/ini/info/SiteIni.Pac...

The data is OK, only the icons are duplicated. I figured out that if I grab just one day, there is one icon. If I grab, let's say, 3 days, the same icon appears 3 times. For every day it adds the same icon to the string, separated by |.
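
Until the ini itself is corrected, the duplicated value could also be cleaned up after the grab. A minimal post-processing sketch in Python, assuming the icon src is the same URL repeated once per grabbed day and joined with `|`; `unique_icons` is a hypothetical helper, not part of WebGrab+Plus.

```python
def unique_icons(src):
    """Split a '|'-joined icon string and return the distinct URLs in order."""
    seen = []
    for url in src.split("|"):
        url = url.strip()
        if url and url not in seen:
            seen.append(url)
    return seen


# Same logo repeated for a 3-day grab, as described above.
src = "|".join(["http://cdn1.siol.tv/logo2/150x80/24kitchenhd.png"] * 3)
icons = unique_icons(src)
```

Writing only the first entry of `icons` back into the icon element would give tvheadend the single URL it expects.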

 

odem81
Offline
Joined: 9 years
Last seen: 8 years

I have just made some modifications to the ini and now it is correct: just one channel logo, as it should be.

Can you check if anything needs to be changed? Otherwise this ini can be moved to downloads.

1NSdbZVbpZDX
Offline
Joined: 10 years
Last seen: 7 years

It's a mysterious world....

illiac4
Offline
Joined: 10 years
Last seen: 7 years

Hi.

Is it possible to fix the ini file for WebGrab? http://tv-spored.siol.net/ has been completely redesigned and new files are needed to scrape it.

 

TNX

illiac4
Offline
Joined: 10 years
Last seen: 7 years

TNX. I see that it is down. I will test the script over the weekend if it is up again.

Miki
Offline
Joined: 10 years
Last seen: 8 years

It works pretty well. I wonder why the incremental update does not work. It shows everything as new even though it should be the same, even if you run it 5 minutes after the initial grab.

illiac4
Offline
Joined: 10 years
Last seen: 7 years

Is anyone else getting "The operation has timed out" after 3 seconds?

illiac4
Offline
Joined: 10 years
Last seen: 7 years

It works for a minute or so and then it just says timed out. Channels are alive (EPG).

siki12
Offline
Joined: 8 years
Last seen: 5 months

Same thing here, time out on every channel.

 

channel (xmltv_id=TV SLO 2) site -- SIOL.NET -- mode incremental

iiiinnnnnnnnnnnnnnnnnnnnnnnnnnnerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nnerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nnnerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nnerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

error downloading page: The operation has timed out (10sec)

Retry 2 of 4 times

nnnnnerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

nerror downloading page: The operation has timed out (5sec)

Retry 1 of 4 times

error downloading page: The operation has timed out (10sec)

Retry 2 of 4 times

Miki
Offline
Joined: 10 years
Last seen: 8 years

Maybe they block your ip. Does it open in a browser?

siki12
Offline
Joined: 8 years
Last seen: 5 months

In the afternoon it works better. I get the time-out issue only on some channels, but the EPG downloads anyway. The time-out appears on channels at random.

I have a 10/4 VDSL connection. In a web browser the site works flawlessly. I also have a clean line, so the internet connection is not the issue.

Maybe there is an issue on Siol's side?

illiac4
Offline
Joined: 10 years
Last seen: 7 years

No, I am not blacklisted. There must be something else. During parsing I can access the EPG through the browser. Also, incremental does not work: it always parses from the beginning.

Miki
Offline
Joined: 10 years
Last seen: 8 years

I am also racking my brain over incremental. It should work normally. The time structure is the same. Maybe Francis or Jan could take a look at it.

poldehudoklin
Offline
Joined: 8 years
Last seen: 8 years

Hi!

Thanks for the new version of parser.

It works, but it seems, SIOL is throttling access to the page.

I have temporarily solved the problem by increasing retry time-out in ProgramData\ServerCare\WebGrab\WebGrab++.config.xml

from retry time-out="5" to retry time-out="15"

I have also reduced number of channels to grab and days to grab to minimum.
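
For anyone who prefers to script that change, a rough Python sketch follows. It assumes the config contains a `retry` element with a `time-out` attribute, as quoted above; the `settings` root element and the retry count of 4 in the sample are assumptions for illustration, so check them against your own WebGrab++.config.xml before using anything like this.

```python
import xml.etree.ElementTree as ET

# Stand-in for WebGrab++.config.xml; only the retry line matters here.
sample = '<settings><retry time-out="5">4</retry></settings>'


def raise_retry_timeout(xml_text, seconds):
    """Return the config text with every retry element's time-out set to `seconds`."""
    root = ET.fromstring(xml_text)
    for retry in root.iter("retry"):
        retry.set("time-out", str(seconds))
    return ET.tostring(root, encoding="unicode")


updated = raise_retry_timeout(sample, 15)
```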

 

Miki
Offline
Joined: 10 years
Last seen: 8 years

I rewrote the ini. It now takes the EPG from the index pages. It's much faster and incremental works. You can still enable scrubbing of the detail pages if you want actors and directors. The episode number is calculated to my personal preference; recalculate it if you use other systems, where it should be -1.

 

here's the ini:

 

site {url=siol.net|timezone=UTC+01:00|maxdays=4|cultureinfo=sl-SI|charset=UTF-8|titlematchfactor=90|ratingsystem=IMDB}

url_index{url|http://tv-spored.siol.net/kanal/|channel|/datum/|urldate|}

url_index.headers {customheader=Accept-Encoding=gzip,deflate}

urldate.format {datestring|yyyyMMdd}

*

index_showsplit.scrub {multi|<main role="main" class="table-list">|<div class="row" data-id=||</main>}

*

index_title.scrub {single|<div class="col-9">|<strong>|</strong>|</div>}

index_category.scrub {single|<div class="col-2 right">|<small class="gray">|</small>|</div>} *index page category

index_category.scrub {multi(separator="," include=first)|<p class="event-meta">||</p>|</p>}

index_category.modify {remove(type=regex)|".*\/"}

index_start.scrub {single(debug)|<div class="col-1">||</div>|</div>}

index_description.scrub {single|<div class="col-9">|<p>|</p>|</div>}

*index_description.modify {addend|\n}

index_rating.scrub {single|<i class="fa fa-clock-o"></i>|IMDB:|!??!|<span>}

index_showicon.scrub {single|<div class="col-3">|<img data-src="|"| title}

index_country.scrub {single(separator="," include=last)|<p class="event-meta">||<br>|<i class="fa fa-clock-o">}

 

index_temp_8.scrub {single(separator="," include="sezona")|<p class="event-meta">||</p>|</p>}

index_episode.scrub {single(separator="," include="del")|<p class="event-meta">||</p>|</p>}

 

 

 

************************* Uncomment for more detailed info (much slower, incremental does not work)

*index_urlshow {url|http://tv-spored.siol.net|<p><a href="|||"}

*index_urlshow.headers {customheader=Accept-Encoding=gzip,deflate}

***

*title.scrub {single|<article role="article">|<h1>|</h1>|<p class}

*start.scrub {regex||<div class="time">[^>]*(\d{2}:\d{2})[^>]*-[^>]*\d{2}:\d{2}[^>]*</div>||}

*description.scrub {multi(include=2)|<p class="content">||</p>|</p>}

*director.scrub {single(separator="," include=first2)|Režija: </b>||</p>|</p>}

*actor.scrub {single(separator="," include=first5)|Igrajo: </b>||</p>|</p>}

*

 

 

 

***********************

 

  index_temp_8.modify {remove(not "")| sezona }

*temp_8.modify {addstart(null)|1}

  index_temp_8.modify {calculate(format=F0)|}

*temp_8.modify {calculate(format=F0)|1 -}

index_episode.modify {remove(not "")| del}

*episode.modify {addstart(null)|1}

  index_episode.modify {calculate(format=F0)|}

*episode.modify {calculate(format=F0)|1 -}

index_episode.modify {addstart|'index_temp_8'. }

index_episode.modify {addend|. 0/0}

index_episode.modify {remove(not "")|0. 0 .0/0}

*

country.modify {replace(null)|Združene države Amerike|ZDA}
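
The season and episode handling at the end of the ini can be hard to follow. As a rough illustration in Python (not a translation of the ini lines), the sketch below pulls "N. sezona" and "N. del" out of an event-meta style string and builds a zero-based value of the kind the xmltv_ns episode system uses, which is where the "-1" mentioned above comes from. The sample string and the function name `xmltv_ns` are assumptions for illustration.

```python
import re


def xmltv_ns(meta):
    """Build a zero-based 'season.episode.' string from '2. sezona, 9. del' style text."""
    season = re.search(r"(\d+)\.\s*sezona", meta)
    episode = re.search(r"(\d+)\.\s*del", meta)
    if episode is None:
        return None  # nothing to report without an episode number
    # The page counts from 1; the zero-based form subtracts 1 from each number.
    s = str(int(season.group(1)) - 1) if season else ""
    e = str(int(episode.group(1)) - 1)
    return f"{s}.{e}."


value = xmltv_ns("Dokumentarni / Ostalo, 2. sezona, 9. del, Ostalo")
```

Systems that expect one-based numbers would simply skip the subtraction.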

illiac4
Offline
Joined: 10 years
Last seen: 7 years

TNX for this one. If you update the scraper, can you upload it as an ini file or use pastebin to paste the text?

There are still some bugs present. At the end I see (n) on every scrape.

Also, "preberi več" could be filtered out.

And here is an example of scraping CBS Reality:

This is the scraped xml:

<programme start="20160408103500 +0200" stop="20160408110000 +0200" channel="CBS Reality">
<title lang="sl">Preživeli za las na posnetkih</title>
<desc lang="sl">
<a href="/kanal/reality/oddaja/2203427660/datum/20160408">» preberi več</a>.(n)
</desc>
<category lang="sl">Dokumentarni</category>
<category lang="sl">Ostalo</category>
<icon src="http://vimg.siol.tv/sioltv/epg/default/documentaire.png"/>

This is what is shown on the siol site:

10:35 Preživeli za las na posnetkih
Dokumentarni
PREŽIVELI ZA LAS NA POSNETKIH
Dokumentarni / Ostalo, 2. sezona, 9. del, Ostalo
None
» preberi več

Miki
Offline
Joined: 10 years
Last seen: 8 years

I know. It's not finished. I was just so happy that incremental works that I had to paste it in the forum. I am cleaning up the code and will attach the ini when it's finished.

Update: Ini ready for testing. Incremental working. Feedback welcome.

illiac4
Offline
Joined: 10 years
Last seen: 7 years

I am testing it right now. It is much slower because it scrapes deeper into the EPG. I assume that if I comment out the part marked ************************* detailed page scrub, it will read just from the index page.

It might also be a good idea to make scraping the thumbnail link optional, since it takes up some space in the xml.

Also, in the xml file (not yet imported into the backend) I can still see this (n) at the end of each scrape. I will investigate once it is finished.

Miki
Offline
Joined: 10 years
Last seen: 8 years

Still working on it, and yes, if you delete the part beyond the index it will be faster. As for (n), you have to disable it in the config.

n = nomark    disables the update-type marking (n) (c) (g) (r) at the end of the description

Update:

Final version for my personal taste. I will modify it if I find any bugs. Don't forget to increase index-delay="xx" because of the time-out issue with siol. Enjoy.

odem81
Offline
Joined: 9 years
Last seen: 8 years

Hi,

When using this latest ini, the channel icons are missing. Can you please fix this?

odem81
Offline
Joined: 9 years
Last seen: 8 years

Tnx, it works!

illiac4
Offline
Joined: 10 years
Last seen: 7 years

It looks like there are changes on the site again, since the scraper does not work anymore. At a quick look, the bold section is new.

http://tv-spored.siol.net/kanal/3sat/oddaja/23189932123/datum/20160616

And I assume that this line has to be changed to reflect the changes:

url_index{url|http://tv-spored.siol.net/kanal/|channel|/datum/|urldate|}

 

 

simon
Offline
Donator
Joined: 8 years
Last seen: 5 months

Hi! Good work, and it really worked perfectly for one week. Today I noticed a small problem again. Am I the only one? If it is possible to fix it, I will be very grateful...

Damjanc
Offline
Joined: 8 years
Last seen: 6 years

Thank you, Blackbear199. Working OK.

illiac4
Offline
Joined: 10 years
Last seen: 7 years

Is parsing very slow for others as well? It was much faster in the previous version; the latest one is really slow. 15 hours or more for 2 days?

illiac4
Offline
Joined: 10 years
Last seen: 7 years

http://pastebin.com/idJr92r7

This time it finished in a little more than two hours. But a full grab would take much longer: this run produced only c (changed) and almost no n (new) entries, because I ran it right after the previous one finished, to test it.

Damjanc
Offline
Joined: 8 years
Last seen: 6 years

Hi,

I think they have changed something once again. WebGrab says "no shows in indexpage". Could someone be so kind as to update the ini files?

Thank you

illiac4
Offline
Joined: 10 years
Last seen: 7 years

It seems that it is not the parser; there is no EPG on their site. So wait until they fix it.

 

tarzUG
Offline
Joined: 8 years
Last seen: 8 years

The ini file is not OK as it is. It does not work for me either, although the EPG is on the site and I can look at all of the programs. It seems they changed something again. If some knowledgeable person can fix it, we will all be very happy :)

 

 

Damjanc
Offline
Joined: 8 years
Last seen: 6 years

Hi,

Thank you very much for help. Everything is working OK now.

 

Brought to you by Jan van Straaten

Program Development - Jan van Straaten ------- Web design - Francis De Paemeleere
Supported by: servercare.nl