Get Channel Logo - horizon.tv.ch

koori
Offline
Joined: 4 years
Last seen: 4 years
Hello, I found that the horizon.tv.ch.ini file is not grabbing the channel icons.

I spent the last hour trying the following entry, but unfortunately it's not working. Maybe you can see what I am doing wrong here?

Page I tried to grab:

https://www.horizon.tv/de_ch/tv-schauen/live-channel.html/256616999367/1...
My Ini Entry:

Quote:

index_urlchannellogo.scrub { url () ||<div class="box-art logo boxart-small">| src="|"|alt=|</div>}

The html part of the homepage I want to grab:

Quote:

<div class="description">
<div class="box-art logo boxart-small">
<img src="https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png?w=75&... alt="ITV 3">
</div>

Once this finally works, it could also be uploaded to the ini EPG list :)

koori

Hello, thanks for your hints. Unfortunately it's not working. The output is the following.

  <channel id="ITV 3 UK">
    <display-name lang="de">ITV 3 UK</display-name>
    <url>http://www.horizon.tv</url>
  </channel>

koori

The ITV 3 logo that is part of the HTML document.

koori
Blackbear199 wrote:

OK, I looked at the page and this should get the channel icon (ITV 3 example):

index_urlchannellogo.scrub {url|<div class="box-art logo boxart-small">|src="|">|</div>|<h2 }

Thanks for your help and patience. I added the debug attribute to url and got the following output.

[  Info  ] channel (xmltv_id=ITV 3 UK) site -- HORIZON.TV.CH -- mode full
[  Debug ] Debugging information SiteIni
[  Debug ] Element:  INDEX_URLCHANNELLOGO
[  Debug ] html source written to : C:\ProgramData\ServerCare\WebGrab\html.source.htm
[  Debug ] scrub strings:
[  Debug ]      type & arguments : url(debug)
[  Debug ]      headstring       : <div class="box-art logo boxart-small">
[  Debug ]      blockstart   (bs): src="
[  Debug ]      elementstart (es): ">
[  Debug ]      elementend   (ee): </div>
[  Debug ]      blockend     (be): <h2 
[  Debug ] 
[  Debug ] No Block with these separators

And here is the source output for one ID. The URL of the itv_3 logo is not in it.

{"id":"16807026","title":"KRIMI","scheme":"urn:tva:metadata:cs:UPCEventGenreCS:2009"}
],"isAdult":false,"cast":["Kevin Whately","Laurence Fox","Clare Holman","Rebecca Front","Owen Teale","Tom Harper","Gina McKee"],"directors":["Sarah Harding"],"images":[{"assetType":"boxart-xlarge","assetTypes":["boxart-xlarge"],"width":210,"height":303,"url":"https://wp21-images-ch-dynamic.horizon.tv/linear_images/20731373532.p.jp..."}
,{"assetType":"boxart-small","assetTypes":["boxart-small"],"width":75,"height":108,"url":"https://wp21-images-ch-dynamic.horizon.tv/linear_images/20731373532.p.jp..."}
,{"assetType":"boxart-medium","assetTypes":["boxart-medium"],"width":110,"height":159,"url":"https://wp21-images-ch-dynamic.horizon.tv/linear_images/20731373532.p.jp..."}
,{"assetType":"boxart-large","assetTypes":["boxart-large"],"width":180,"height":260,"url":"https://wp21-images-ch-dynamic.horizon.tv/linear_images/20731373532.p.jp..."}
,{"assetType":"tva-boxcover","assetTypes":["tva-boxcover"],"width":180,"height":260,"url":"https://wp21-images-ch-dynamic.horizon.tv/linear_images/20731373532.p.jpg"}
],"mediaGroupId":"crid:~~2F~~2Feventis.nl~~2F00000000-0000-1000-0008-00000000899B","secondaryTitle":"Old School Ties","shortDescription":"When an ambitious Oxford student is found dead in her hotel room after inviting a reformed computer hacker to speak at the Union, Lewis and Hathaway investigate. IMDb rating: 7.6/10.","mediaType":"Episode","year":"2007","isReplayTv":false,"seriesEpisodeNumber":"2","seriesNumber":"1","videoStreams":[],"airDate":1167609600000,"entitlements":["VIP","_OPEN_"],"currentProductIds":[],"currentTvodProductIds":[]}
}

koori
Blackbear199 wrote:

My bad. If you look at 4.4.3 in the documentation, I don't think we can use scrub url with this, so we need to do it a different way.

index_temp_1.scrub {single|<div class="box-art logo boxart-small">|src="|">|</div>}

index_urlchannellogo {url||'index_temp_1'}

or I think this also works:

index_urlchannellogo.scrub {single|<div class="box-art logo boxart-small">|src="|">|</div>}

This seems to be harder than expected :) Still the same result. I assume the URL I posted at the beginning is not the correct one, so I have uploaded the ini and config file with that channel as an example. Maybe that helps you a bit more.

 

koori

Nice try. If I understand you correctly, you are trying to separate the URL out of this part, right?

<a href="/de_ch/tv-schauen/live-channel.html/256616999367" class="channel-link">
 <span class="ooh-indicator"></span>
  <img src="https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png?w=75&amp;h=108&amp;mode=box" alt="ITV 3">
</a> 
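
For anyone who wants to check this fragment outside WebGrab+Plus, here is a minimal Python sketch that pulls the logo URL out of it. It only illustrates what the scrub is supposed to find and is not how wg++ works internally. `LogoFinder` and `find_logos` are made-up names, and the fragment is a trimmed copy of the one above (the href is left out).

```python
from html.parser import HTMLParser

# Trimmed copy of the fragment quoted in the thread (href left out for brevity).
FRAGMENT = """
<a class="channel-link">
 <span class="ooh-indicator"></span>
  <img src="https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png?w=75&amp;h=108&amp;mode=box" alt="ITV 3">
</a>
"""


class LogoFinder(HTMLParser):
    """Collect the src of every <img> inside an <a class="channel-link">."""

    def __init__(self):
        super().__init__()
        self.in_link = False  # True while we are inside a channel-link anchor
        self.logos = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a":
            classes = (attr.get("class") or "").split()
            self.in_link = "channel-link" in classes
        elif tag == "img" and self.in_link and attr.get("src"):
            # HTMLParser has already turned &amp; into & in attribute values.
            self.logos.append(attr["src"])

    def handle_endtag(self, tag):
        if tag == "a":
            self.in_link = False


def find_logos(html_text):
    """Return all channel-link logo URLs found in html_text."""
    finder = LogoFinder()
    finder.feed(html_text)
    finder.close()
    return finder.logos


print(find_logos(FRAGMENT))
```

The returned URL still carries the `?w=75&h=108&mode=box` part, so the sizing parameters would have to be handled separately.
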
koori

Hi, thank you very much. I created an output of the separated URL to the image. I assume that ?w=110&h=150&mode=box needs to be removed, right?

[  Debug ] ----------begin--block----------
[  Debug ]  content="https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png?w=110..."/>
[  Debug ] ----------end----block----------
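
Dropping the query string from such a URL can be sketched in Python as follows. This is only an illustration of the clean-up step, not a siteini command. `strip_query` is a made-up helper, and the sample URL is assembled from the logo path and the `?w=110&h=150&mode=box` suffix mentioned in this post.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return url without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Sample URL assembled from the values quoted in the thread.
logo = ("https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png"
        "?w=110&h=150&mode=box")

print(strip_query(logo))
# prints https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png
```
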

 

 

koori

Thanks a lot. Even with the proper URL, the channel icon does not end up in the XML. The goal is not far away :)

There is something that confuses me quite a bit: when I change index_urlchannellogo to index_urlshow just for testing, I get exactly the channel icon, but as part of the movie description. index_urlchannellogo itself does not seem to work.

  <channel id="ITV 3 UK">
    <display-name lang="de">ITV 3 UK</display-name>
    <url>http://www.horizon.tv.ch</url> --> Missing: <icon>Bla</icon>
  </channel>

[  Debug ] Modify
[  Debug ]      command & arguments : set(debug)
[  Debug ]      Expression-1            : 'temp_4'
[  Debug ]      Element value before operation:
[  Debug ] https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png
[  Debug ]      String composer result for Expression-1 :
[  Debug ]      Expression-1 expanded   : https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png
[  Debug ]      Element value after operation:
[  Debug ] https://wp9-images-ch-dynamic.horizon.tv/channellogos/02/itv_3.png
[  Debug ]    skipped 

francis
Offline
WG++ Team member, Donator
Joined: 8 years
Last seen: 1 month

Here is the explanation of why it confuses you, and I think I've got bad news: I don't think you will get the channel icon with the current siteini implementation.

The channel logo URL is only grabbed on the index page (see 4.6.1.1 in the docs), during the datelogo scope (before the showsplit!).

What you are trying to do is grab the channel logo from a show page.

If you add debug to the showsplit and check the html.source.htm file, you will see the data where wg++ can search for the channel logo. Currently I don't see any connection between that data and the channel logo. That's why I think the current siteini cannot support the channel logo URL.

koori

Hi Francis, thank you very much for your explanation. At least we know a bit more now. This might be a new feature request :)

francis

I made a temporary workaround. The channel logo URL is available in the .channels.xml generation part, so I added the info with the site_id value.

So now there is a new .channels.xml file that you must use with this siteini.

Yes, it could be a feature request. But we currently have many other things on our minds: we are rewriting part of the configuration code, and the new installer is also still on our list. But time .....

koori

Thank you, Francis.

This is the output I get, but I will try to extract the image from the XML you provided. I understand that you have many other things in your pipeline.

Just a side question: will there be some kind of parallelization when grabbing? I don't know which language WebGrab is written in, but in .NET it's fairly simple to implement such features.

  <channel id="ITV 2 UK">
    <display-name lang="de">ITV 2 UK</display-name>
    <icon src="27909159384" />
    <url>http://www.horizon.tv</url>
  </channel>

 

francis

OK, here is an adjusted siteini. (The previous one also works correctly, but this one won't output a channel logo if you use an incorrect channel.)

The main reason you don't see it working is that you used the old channel definitions in your config file.

So delete all the horizon.tv.ch channels in your config file, open my new .channels.xml file (#20 posts above) and use these new entries.

francis

About running things in parallel: no, nothing is done that way. In the early days of wg++ I also asked for such a thing to be introduced, because I wanted to speed things up.

But looking back now, with the knowledge and experience I have gained, it's not a top priority any more.

As you may have seen, some of the wg++ settings are even "slowdown" settings, to make sure the site does not block you for loading their server too much. So running multiple grabs in multiple threads would not help and would create strange effects. Also, for most siteini grabbing, the webpage download is the bottleneck (the slowest part), and putting several of those requests in separate threads won't help a lot. Nevertheless, it is still in the back of my mind to implement such a thing.

And if it were that simple, it would already have been implemented. But hopefully one day we can do this.
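
The "slowdown" idea can be sketched in Python as a plain sequential loop with a pause between requests. This only illustrates the principle and is not how wg++ implements its settings. `fetch_politely`, `delay_s` and the stand-in `fetch` callable are hypothetical names.

```python
import time


def fetch_politely(urls, fetch, delay_s=1.0, sleep=time.sleep):
    """Call fetch(url) for each URL in order, pausing delay_s between requests.

    fetch is any callable that downloads one page. sleep can be swapped out
    (for example in tests) so that no real waiting happens.
    """
    results = []
    for i, url in enumerate(urls):
        if i:  # no pause before the very first request
            sleep(delay_s)
        results.append(fetch(url))
    return results


# Usage with a stand-in fetch function instead of a real download:
pauses = []
pages = fetch_politely(["page1", "page2", "page3"], str.upper,
                       delay_s=2.0, sleep=pauses.append)
print(pages, pauses)
```

With three pages there are two pauses, so the total run time grows with the delay even though each download is simple. Spreading the same requests over several threads would shorten the run, but it would also undo the purpose of the pause.
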

koori

Your explanation changed my view on parallel optimisation. The only use I can see would be to run several threads for the different pages to grab.

By the way, thank you very much for all your effort. Your files are working pretty well :)

