Canalsat.fr no longer works.

eratox
Offline
Donator
Joined: 8 months
Last seen: 1 day

Hello,

I am a user of EPG /France/CanalSat.fr.
Unfortunately, it has not worked for a few days.

If someone can get the EPG back on the road, it would be greatly appreciated.

Eratox.

poloche63
Offline
Joined: 1 month
Last seen: 4 days

Hi,

+1 for this request! CanalSat is the most important EPG in France. It carries a lot of exclusive channels such as Clique TV, Viceland, Olympia, etc.

If someone can help us get this grabber working again, it would be very kind.

The new URL requests go to a JSON site.

Ex.: channel 601 (Canal+) for day = 0: http://hodor.canalplus.pro/api/v1/mycanal/gridTVContent/1f386e59bd499c29...

"url_index{url|http://hodor.canalplus.pro/api/v1/mycanal/gridTVContent/1f386e59bd499c29...|channel|.json?cache=360000&params[day]=|urldate|&imagesForPrimeOnly=true}"

I can extract all the channel IDs and names into a new channels.xml, but I don't understand how to edit the ini file for a JSON site... :s

Please help us find a way to get it working again.

Thanks to the grabber community :)

bellicheone
Offline
Joined: 3 weeks
Last seen: 4 days

+1, EPG /France/CanalSat.fr has not worked for a few days.

I found another URL that uses the site_id values from the current "canalsat.fr.channels.xml" file.

Example for CANAL+ (site_id="301"):
https://hodor.canalplus.pro/api/v1/mycanal/channels/96119d61cb9cb943ac65...

Modified for canalsat.fr.ini:
https://hodor.canalplus.pro/api/v1/mycanal/channels/96119d61cb9cb943ac65...|channel|/broadcasts/day/|urldate|

The result is a JSON response, as with poloche63's URL.

I hope the "96119d61cb9cb943ac658699affb2314" part of the URL is not dynamic.
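
To make the template above concrete, here is a minimal Python sketch (not siteini syntax) of how the |channel| and |urldate| slots could be filled. The helper name `build_index_url` is hypothetical, `"TOKEN"` is a placeholder for the long identifier, and the slash between the identifier and the channel is an assumption, because the posted URL is truncated at that point.

```python
# Hypothetical helper: fills the |channel| and |urldate| slots of the url_index template.
API_ROOT = "https://hodor.canalplus.pro/api/v1/mycanal/channels"


def build_index_url(token: str, site_id: str, day_offset: int) -> str:
    """Return the index URL for one channel and one day (0 = today)."""
    if day_offset < 0:
        raise ValueError("day_offset must be 0 or greater")
    # Assumption: a "/" separates the identifier from the channel site_id.
    return f"{API_ROOT}/{token}/{site_id}/broadcasts/day/{day_offset}"


# "TOKEN" is a placeholder, not a real value.
print(build_index_url("TOKEN", "301", 0))
```

With `urldate.format {daycounter|0}` the day slot is a counter starting at 0, which is why the sketch takes a day offset rather than a calendar date.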

I noticed that the timestamps are in the +00 time zone, so it seems 1 hour has to be added.

I will look at what can be done based on the "program-television.org.ini" file, which also parses JSON.

I'm new to the community; a little help would be welcome.

Sorry for my English :)

mat8861
Offline
WG++ Team member, Donator
Joined: 4 years
Last seen: 11 hours

It is a bit more complicated: easy to see, but more difficult to grab. The siteini needs a complete rewrite; I will look into it, time permitting. The time zone is UTC, so the timestamps are correct; your device will convert them to the correct local time.
PS:
Welcome!
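
The point about UTC can be illustrated with a small Python sketch. It assumes the site delivers a Unix timestamp in seconds (the value below is a made-up example) and uses a fixed +01:00 offset for Paris winter time, so it ignores summer time. It only shows that the same instant can be written with either offset in the XMLTV time format.

```python
from datetime import datetime, timedelta, timezone

# Fixed winter offset for Paris (CET); summer time would be +02:00.
CET = timezone(timedelta(hours=1))


def to_xmltv(epoch_seconds: int, tz: timezone) -> str:
    """Format a Unix timestamp the way XMLTV writes start/stop attributes."""
    return datetime.fromtimestamp(epoch_seconds, tz=tz).strftime("%Y%m%d%H%M%S %z")


start = 1581796800  # made-up example: 2020-02-15 20:00:00 UTC
print(to_xmltv(start, timezone.utc))  # 20200215200000 +0000
print(to_xmltv(start, CET))           # 20200215210000 +0100
```

Both strings describe the same moment, so a player that honours the offset shows 21:00 in Paris either way.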

jamesm147
Offline
Donator
Joined: 9 months
Last seen: 3 weeks

Thank you Matt! :)

bellicheone
Offline
Joined: 3 weeks
Last seen: 4 days

I am working on the "canalsat.fr.ini" file.

I can get the "title" and "subtitle" data from the new URL.

I read the v2.2 documentation, but I ran into 2 problems:
1 - The date/time data in the XML output file is not correct: the time is always 00:00, but the date is correct.
2 - How do I use a secondary URL to get additional data? I need it for "description", "actors", "producers", etc.

Who can help with these questions?

Thank you in advance for your contributions.

Here is an example of the XML output:
**********************************************************************************

CANAL+
http://www.canalplus.com

L'hebd'Hollywood
Du 15 févr

Les p'tits diables - S3 - Ép 20
Série Animation

**********************************************************************************

Here is my "canalsat.fr.ini" file:

**********************************************************************************
site {url=canalplus.com|timezone=Europe/Paris|maxdays=11|cultureinfo=fr-FR|charset=utf-8|titlematchfactor=10}
site {ratingsystem=CSA|episodesystem=onscreen|nopageoverlaps|allowlastpageoverflow}
*
url_index {url|https://hodor.canalplus.pro/api/v1/mycanal/channels/96119d61cb9cb943ac65...|channel|/broadcasts/day/|urldate|}
url_index.headers {accept=text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9}
url_index.headers {accept-encoding=gzip,deflate,br}
url_index.headers {origin=https://www.canalplus.com}
url_index.headers {referer=https://www.canalplus.com/programme-tv/}
*
urldate.format {daycounter|0}
*
index_showsplit.scrub {multi|{"timeSlices":|{"contentID||]}});}
index_showsplit.modify {cleanup(style=unidecode)}
index_showsplit.modify {cleanup(style=jsondecode)}
*
index_title.scrub {single|"title":"||",}
index_subtitle.scrub {single|"subtitle":"||",}

**WIP HERE
index_start.scrub {single|"startTime":"||",}
*index_start.modify {calculate(format=date,yyyy/MM/dd HH:mm)} *convert UNIX date to yyyy/MM/dd HH:mm
*index_start.modify {calculate(format=time,HH:mm)}
*index_start.modify {calculate(format=utcdate)}
*index_start.modify {calculate(format=date,unix)}
**
**
index_temp_1.scrub {single|"URLPage":"||"}
index_temp_2{url(debug)|'index_temp_1')
description.scrub {single(debug)|"summary":"||",}
*index_temp_2.scrub {multi(debug)|{"prefix": "De :"|{"||]}});}
*
**index_episode.scrub {regex(pattern="nbEpi:'Et1',numEpi:'E1',saison:'S1'")||","nbEpi":"\d+","numEpi":"\d+","saison":"\d+"||}
**index_episode.modify {remove|"}
**index_category.scrub{single|ture":"||",}
**index_actor.scrub {single(separator="," max=3)|"Acteur":["||"],}
**index_actor.modify {remove|"}
**index_producer.scrub {single(separator="","" max=2)|"Réalisateur|":["|"|],"}
**index_producer.modify {replace|","|\|}
**index_presenter.scrub {single|{"Présentateur vedette":|["|"|]|,}
**index_composer.scrub {single(max=2)|"Musique":["||"],}
**index_composer.modify {replace|","|\|}
**index_urlchannellogo.modify {addstart|http://www.programme-television.org/logo_channels/35x35/chaine_'config_site_id'.png}
***
**index_urlshow {url|http://www.programme-television.org|"urlDiff|":"|"|,}
**index_urlshow.headers {customheader=Accept-Encoding=gzip,deflate} * to speedup the downloading of the detail pages
**index_temp_1.modify {set|'index_urlshow'}
**index_temp_1.modify {remove(type=regex)|"(^.*?#).*?$"}
***
**country.scrub {single|Pays de production :|||}
**title.scrub {single||||}
**title.modify {cleanup(tags="<"">")}
**subtitle.scrub {single()|•||•}
**description.scrub {single||||}
**description.modify {cleanup(tags="<"">")}
***
**index_urlsubdetail.headers {customheader=Accept-Encoding=gzip,deflate} * to speedup the downloading of the subdetail pages
**index_urlsubdetail.modify {set|'index_temp_1'}
**index_urlsubdetail.modify {addstart()|http://www.programme-television.org/getinfos/}
***
**subdetail_title.scrub{single||||<\/h1>}
**subdetail_title.modify {cleanup(tags="<"">")}
**subdetail_title.modify {cleanup(style=jsondecode)}
**subdetail_description.scrub {single|||<\/div>|<\/div>} *required by movies
**subdetail_description.modify {cleanup(style=jsondecode)}
**subdetail_description.modify {cleanup(tags="<"">")}
**subdetail_productiondate.scrub {single|•<\/li>||<\/li>}
**country.scrub {single|Pays de production :<\/strong>||<\/li>}
**********************************************************************************
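
For readers who want to see what the index scrubs are aiming at, here is a rough Python sketch (not siteini syntax) that pulls the same fields out of a JSON payload. The key names ("timeSlices", "contentID", "title", "subtitle", "startTime", "URLPage") come from the scrub strings above; the nesting, the example URLs, and the millisecond timestamps are assumptions made for illustration only.

```python
import json

# Hypothetical payload: key names come from the scrub strings, the nesting is assumed.
SAMPLE = """
{"timeSlices": [{"contents": [
  {"contentID": "a1", "title": "L'hebd'Hollywood", "subtitle": "Du 15 févr",
   "startTime": 1581796800000, "onClick": {"URLPage": "https://example.invalid/a1"}},
  {"contentID": "a2", "title": "Les p'tits diables - S3 - Ép 20",
   "subtitle": "Série Animation", "startTime": 1581800400000,
   "onClick": {"URLPage": "https://example.invalid/a2"}}
]}]}
"""


def iter_shows(node):
    """Yield every dict that carries a "contentID" key, wherever it is nested."""
    if isinstance(node, dict):
        if "contentID" in node:
            yield node
        for value in node.values():
            yield from iter_shows(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_shows(item)


shows = list(iter_shows(json.loads(SAMPLE)))
for show in shows:
    print(show["startTime"], show["title"], "|", show["subtitle"])
```

Walking the tree recursively avoids depending on the exact nesting, which is the part of the real response this sketch cannot confirm.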
    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    index_urlshow {url||"URLPage":"||"|"}
    index_urlshow.headers {customheader=Accept-Encoding=gzip,deflate,br}

    bellicheone
    Offline
    Joined: 3 weeks
    Last seen: 4 days

    Hello community,

    I have good news.

    Attached you will find the new "canalsat.fr.ini" file. There is no need to modify the "canalsat.fr.channels.xml" file.

    I hope it reaches you well.

    Thank you, mat8861, for your help.

    Bellicheone

    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    Very good news, my friends!!!
    I'm testing it now ^^

    Thanks a lot

    :D

    PS: after grabbing the EPG for channel 813 (Olympia), there is a "(?)" at the end of every show title. Could you please edit the ini file and check why?

    Thanks, bro

    bellicheone
    Offline
    Joined: 3 weeks
    Last seen: 4 days

    I had the same problem with the previous ini file.

    You have 2 solutions:
    1 - Modify the ini file: add "index_title.modify {replace| (?)|}" after "index_title.scrub" (the "+" line is the new one):
    = index_title.scrub {single|"title":"||",}
    + index_title.modify {replace| (?)|}
    = index_subtitle.scrub {single|"subtitle":"||",}

    2 - Create a Linux bash script with this code:
    #!/bin/bash
    # Default guide file; a path given as the first argument overrides it.
    fileGuide="./guide.xml"
    if [ -n "$1" ]; then
    echo "Use File Path From ARG : $1"
    fileGuide="$1"
    fi
    # Strip every " (?)" marker from the guide, editing the file in place.
    sed -i 's/ (?)//g' "$fileGuide"

    exit 0
    ############################################################

    Try this and give me the result.

    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    The "(?)" means that index_title does not match the title on the details page. Your scrub seems correct, but the details title does not include the episode. You can decrease titlematchfactor, or add: title.modify{addstart|'index_title'}. That way the title will always contain index_title and there will be no mismatch.
    I just checked a few channels, and some episodes are also wrong.

    bellicheone
    Offline
    Joined: 3 weeks
    Last seen: 4 days

    Hello community,

    I am continuing to work on the ini file.

    Attached you will find the corrected "canalsat.fr.ini" file:
    - the "(?)" at the end of show titles is removed
    - the episode information is now correct

    For the "canalsat.fr.channels.xml" file, I also checked the channel URLs for IDs 1 to 1000.
    Some channels are no longer available and some new channels are available.
    The new "canalsat.fr.channels.xml" file is also attached.

    Happy testing :)

    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    Hi,

    Thanks a lot for the support, bro :)

    I have a small issue with the grabbing timespan.
    I use a timespan of "10" to grab 11 days in my WebGrab++.config.xml, but since I started using your canalsat.fr.ini, I get an error.

    Other ini files can grab more than 11 days, but if I set timespan = 10 and add a CanalSat channel, WebGrab+Plus kills the process...

    How should I set up my global config so that I can grab 11 days with the other ini files, and the maximum available days on CanalSat, without the run crashing?

    [ Info ] update requested for - 1 - out of - 1 - channels for 11 day(s)
    [ Debug ]
    [ Info ] ( 1/1 ) CANALSAT.FR -- chan. (xmltv_id=Olympia.canal) -- mode Incremental
    [Error ] Unable to update channel Olympia.canal
    [Critical] Generic syntax exception:
    [Critical] message:
    [Error ] no index page data received from Olympia.canal
    [Error ] unable to update channel, try again later
    [ Info ] Existing guide data restored!
    [ Debug ]
    [ Debug ] 0 shows in 1 channels
    [ Debug ] 0 updated shows
    [ Debug ] 0 new shows added
    [ Info ]
    [ Info ]
    [ ] Job finished at 22/02/2020 12:57:57 done in 2s
    [ Debug ] statistics upload error: Le serveur distant a retourné une erreur : (404) Introuvable.

    Do you have an idea how to correct this error?

    ----------

    As the season and episode numbers are in the title, I modified the ini script to put them into the episode-num element:

    Quote:

    episode.scrub {single|{"currentPage":{"displayTemplate":"detailSeason","displayName":"||","path":"}
    episode.modify {set|'index_title'}
    episode.modify {substring(pattern="S'S1' Ep'E1'" type=regex)|'episode' "\s(S\d+\s-\sÉp\s\d+)"}
    episode.modify {replace|S|s}
    episode.modify {replace| - Ép |.e}
    episode.modify {cleanup}

    Result:

    <episode-num system="onscreen">s1.e5</episode-num>
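
As a cross-check of the modify chain above, here is a minimal Python sketch that performs the same steps on a plain string. It is not how WebGrab+Plus runs the siteini; Python's `re` module and `str.replace` stand in for the substring and replace operations, and `strip()` is only a stand-in for `{cleanup}`.

```python
import re


def onscreen_episode(index_title: str) -> str:
    """Mimic the modify chain: extract "S1 - Ép 5", then rewrite it as "s1.e5"."""
    match = re.search(r"\s(S\d+\s-\sÉp\s\d+)", index_title)
    if match is None:
        return ""
    episode = match.group(1)
    episode = episode.replace("S", "s")        # episode.modify {replace|S|s}
    episode = episode.replace(" - Ép ", ".e")  # episode.modify {replace| - Ép |.e}
    return episode.strip()                     # stand-in for {cleanup}


print(onscreen_episode("Thérapie - S1 - Ép 5"))
```

A title without the "S1 - Ép 5" form returns an empty string here, which matches the observation that only this one pattern is caught.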

    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    Well, most of the time not all channels have 11 or 14 days, so it is better to grab 7 days. I personally grab 3 days, as I think it is useless to grab so much data; at most I check what's on that day or the day after. It takes a lot of time to grab, with no advantage. But again, this is my personal need and way of using the EPG.
    "<episode-num system="onscreen">s1.e5</episode-num>"
    The xmltv standard (onscreen) is S1 E5, or the xmltv_ns format; both are supported by WG++. Then there is dd_progid (not very common), but if you like that style... up to you.
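
The difference between the two notations can be shown with a short Python sketch. The function name is hypothetical; the only fact it relies on is that xmltv_ns counts seasons and episodes from zero and separates season, episode and part with dots.

```python
def episode_formats(season: int, episode: int) -> dict:
    """Return the same episode in onscreen and xmltv_ns notation."""
    return {
        "onscreen": f"S{season} E{episode}",
        # xmltv_ns counts from zero: "season.episode.part", part left empty here.
        "xmltv_ns": f"{season - 1}.{episode - 1}.",
    }


print(episode_formats(1, 5))
```

For season 1, episode 5 this gives "S1 E5" and "0.4.", so "s1.e5" is neither of the two.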

    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    How can I use a regex with multiple patterns for the season/episode styles in the CanalSat EPG?

    There are 3 patterns in the titles:

    1 - "S1 - Ép 2" for seasons and episodes (ex: Thérapie - S1 - Ép 2) -> \s(S\d+\s-\sÉp\s\d+)
    2 - "Ép 1" for episodes only (ex: Hustle - Ép 1) -> \s(Ép\s\d+)
    3 - "S1" for seasons only (ex: Vice - S1) -> \s(S\d+)

    Currently, with the code below:

    Quote:

    episode.scrub {single|{"currentPage":{"displayTemplate":"detailSeason","displayName":"||","path":"}
    episode.modify {set|'index_title'}
    episode.modify {substring(pattern="S'S1' Ep'E1'" type=regex)|'episode' "\s(S\d+\s-\sÉp\s\d+)"}

    I only get the first pattern in my XML; the others are not caught... :s
    I would like to check all 3 patterns to catch the 3 formats:

    If pattern 1 matches, set it as 'episode'
    If not, check pattern 2 and set it as 'episode'
    If not, check pattern 3 and set it as 'episode'
    If not, keep 'episode' empty

    How can I do that? :)

    Thanks for the help
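
The fallback logic described above can be sketched in Python as follows. This is only an illustration of the "first pattern that matches wins" idea using the three regexes from the post; it is not siteini syntax, and the function name is hypothetical.

```python
import re

# The three patterns from the post, ordered from most to least specific.
PATTERNS = [
    re.compile(r"\s(S\d+\s-\sÉp\s\d+)"),  # 1: season and episode
    re.compile(r"\s(Ép\s\d+)"),           # 2: episode only
    re.compile(r"\s(S\d+)"),              # 3: season only
]


def find_episode(title: str) -> str:
    """Return the first pattern's capture, or "" when no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.search(title)
        if match:
            return match.group(1)
    return ""


for title in ("Thérapie - S1 - Ép 2", "Hustle - Ép 1", "Vice - S1", "Le journal"):
    print(f"{title!r} -> {find_episode(title)!r}")
```

The order matters: pattern 3 would also match "Thérapie - S1 - Ép 2", so the most specific pattern has to be tried first.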

    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    Try:

    episode.modify {substring(pattern="S'S1' Ep'E1'" type=regex)|'episode' "-\s(?:S\d+(?:\s-\s)*)?(?:[ÉEe]p\s\d+)*"}
    You may also need this (I don't remember):
    episode.modify {remove(type=regex)|\s-\s}
    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    Nice, mat8861 :)

    Works fine!

    I have modified bellicheone's initial version to put the season/episode numbers in the right elements, and I have removed that info from the title.

    If people want the seasons/episodes correctly in the XML and not only in the title, take the ini in the attachments ^^

    Thanks for everything, guys

    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    Change:
    episode.modify {remove(type=regex)|\s-\s}
    into:
    episode.modify {remove(type=regex)|\s?-\s}
    and remove:
    episode.modify {remove(type=regex)|-\s}
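
Why the single `\s?-\s` remove can replace the two earlier ones is easiest to see on an example. The Python sketch below assumes the attached ini applied `\s-\s` first and `-\s` second (the attachment is not shown in the thread), and again uses Python's `re` module as a stand-in for the WebGrab+Plus engine.

```python
import re

raw = "- S1 - Ép 2"  # text captured by the combined episode pattern

# Assumed old behaviour: two separate removes, applied one after the other.
two_step = re.sub(r"-\s", "", re.sub(r"\s-\s", "", raw))

# Suggested behaviour: one remove with an optional leading space.
one_step = re.sub(r"\s?-\s", "", raw)

print(repr(two_step), repr(one_step))
```

Both variants strip the leading "- " and the inner " - ", so the second remove line becomes redundant.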

    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    OK mat8861

    I have another issue: I would like to add the 'episode' variable at the start of the description.

    Quote:

    description.modify {addstart('episode' not "")|'episode': }

    The results are:

    --> when the description is not empty: "s1.e4: Tyler the Creator explore ce qu'il aime, et leur fonctionnement à la rencontre d'experts qui l'aident à inventer et à innover comme seul Tyler le pouvait."

    --> when the description is empty: "s1.e4:."

    Is it possible to remove the ":." at the end when the description is empty? --> "s1.e4:." ---> "s1.e4"
    Otherwise, to remove only the dot "."? --> "s1.e4:." ---> "s1.e4:"

    Thanks

    mat8861
    Offline
    WG++ Team member, Donator
    Joined: 4 years
    Last seen: 11 hours

    I guess the dot is coming from the description, as you are adding something to it.
    So I would do:
    temp_1.modify {addstart|'episode':} *takes the ":" and makes it part of the episode, stored in temp_1
    description.modify {addstart('temp_1' not "")|'temp_1'}
    Result:
    --> when the description is empty: " "
    --> when the description is not empty: "s1.e4: Tyler the Creator....
    If you don't want the dot in s1.e4, add another modify between the temp_1 and description lines:
    temp_1.modify {remove(type=regex)|\.}
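
The behaviour being asked for (prefix the description with the episode, but leave no dangling ":" or "." when the description is empty) can be written down as a small Python sketch. It describes the intended result only; the function name is hypothetical and it says nothing about how WebGrab+Plus evaluates the conditional addstart.

```python
def prefix_description(episode: str, description: str) -> str:
    """Prepend "episode: " only when both parts are non-empty."""
    episode = episode.strip()
    description = description.strip()
    if not description:
        # Nothing to prefix: return the bare episode, without ":" or ".".
        return episode
    if not episode:
        return description
    return f"{episode}: {description}"


print(prefix_description("s1.e4", "Tyler the Creator explore ce qu'il aime."))
print(prefix_description("s1.e4", ""))
```

Building the separator only when both parts exist is what avoids a result such as "s1.e4:.".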

    poloche63
    Offline
    Joined: 1 month
    Last seen: 4 days

    Hmm, when there is a dot in the description after my added 'episode' element, the log file shows that no description element was found.

    [ Debug ] Debugging information SiteIni
    [ Debug ] Element: DESCRIPTION
    [ Debug ] html source written to : C:\Users\xxx\AppData\Local\WebGrab+Plus\html.source.htm
    [ Debug ] scrub strings:
    [ Debug ] type & arguments : single (debug)
    [ Debug ] blockstart (bs): "summary":"
    [ Debug ] elementstart (es):
    [ Debug ] elementend (ee): ",
    [ Debug ]
    [ Debug ] No Block with these separators

    So the dot does not come from the description...
    And your two solutions to remove the dot do not work. I always get the dot after the episode element when the description is not found in the HTML source. :s

