gato.tv is not removing all the html tags

gebriagu

The gatotv site configuration is not parsing the webpage data cleanly: it leaves some of the original site's HTML tags in the output.
I am using the latest version of wg++ (3.0.2), running on macOS High Sierra.

This example is taken from the resulting guide.xml. As you can see, the title element still contains (escaped) HTML tags that should have been removed:

<programme start="20200506140000 -0500" stop="20200506143000 -0500" channel="Canal Telesistema 11 (República Dominicana)">
<title lang="es">Los Simpson&lt;/h1&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="vertical-align:top;"&gt;
&lt;td class="tbl_EPG_th"&gt;
&lt;div style="width:160px; margin-bottom:3px;"&gt;

&lt;a href="https://www.gatotv.com/programa/los_simpson/poster"&gt;
&lt;img itemprop="image" class="img_program" src="https://imagenes.gatotv.com/categorias/programas/los_simpson.png" width="160" height="240" alt="Los Simpson" title="Los Simpson" /&gt;
Ver Poster
&lt;/a&gt;

&lt;/div&gt;
&lt;/td&gt;
&lt;td class="td_basic_info"&gt;
&lt;div&gt;

&lt;span itemprop="dtreviewed" datetime="2013-02-10"&gt;&lt;a class="a_year" href="https://www.gatotv.com/programas/ano/1989" &gt;&lt;span itemprop="copyrightYear"&gt;1989&lt;/span&gt;&lt;/a&gt;&lt;/span&gt;

&lt;span itemprop="reviewRating" itemscope itemtype="http://schema.org/Rating"&gt;
&lt;meta itemprop="worstRating" content="1" /&gt;
&lt;meta itemprop="ratingValue" content="9" /&gt;
&lt;meta itemprop="bestRating" content="10" /&gt;
&lt;img title="9/10" alt="9/10" src="https://imagenes.gatotv.com/half_stars_9.png" /&gt;
&lt;/span&gt;

&lt;strong&gt;en&lt;/strong&gt; &lt;a itemprop="author" itemscope itemtype="http://schema.org/WebSite" href="https://www.gatotv.com/"&gt;&lt;span</title>
<desc lang="es">Una parodia satírica que cuenta las aventuras de los famosos personajes Homero, Marge, Bart, Lisa y Maggie de Springfield.(n)</desc>
<date>1989</date>
<category lang="es">Comedia</category>
<category lang="es">Caricatura</category>
<category lang="es">Programa</category>
<category lang="es">Serie</category>
<icon src="https://imagenes.gatotv.com/categorias/programas/miniatura/los_simpson.png" />
</programme>

mat8861

I see you disabled some lines in the configuration; try re-enabling them.
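
To check whether the problem is gone after re-running the grab, a quick scan of guide.xml can help. This is only a minimal sketch in Python, not part of wg++: the function name find_leftover_html and the choice to check the title and desc elements are assumptions made for illustration. It reports every programme whose text still contains something that looks like an HTML tag.

```python
import re
import xml.etree.ElementTree as ET

# Matches an opening or closing HTML tag such as </h1> or <td class="x">.
TAG_RE = re.compile(r"</?[A-Za-z][^<>]*>")


def find_leftover_html(root, fields=("title", "desc")):
    """Return (channel, start, field) for each field that still contains HTML tags.

    `root` is the parsed <tv> element of an XMLTV file.
    The helper name and the default field list are illustrative assumptions.
    """
    hits = []
    for prog in root.iter("programme"):
        for field in fields:
            for elem in prog.findall(field):
                # ElementTree unescapes &lt; and &gt;, so the text holds raw tags.
                if elem.text and TAG_RE.search(elem.text):
                    hits.append((prog.get("channel"), prog.get("start"), field))
    return hits
```

For a real file, something like `find_leftover_html(ET.parse("guide.xml").getroot())` should list the affected entries; an empty list suggests the titles and descriptions are clean.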

