MDB not matching very obvious movies... please advise

karimf
Offline
Joined: 10 years
Last seen: 8 years

Cheers to all,

I've been struggling to get many movies matched by the MDB postprocess.

What baffles me is that some obvious movies with no similar names/details don't get matched.

I'm using: <site movies="imdb.com.ask,imdb.com.imdb" series="imdb.com.imdb_series"></site>

And I use it for movie channels only, no series in these channels.

Examples from log file:

Movie: "In the line of fire" + (1993) + ""...no match

Movie: "Twister" + (1996) + ""...no match

Notice that the correct title and productiondate are recognized but NOT matched.

The settings in mdb:

<selectmovie duration="55" minumum="2" musthave="title" contains=" " optional="productiondate,actor,director"/>
<selectserie duration="20" minumum="2" musthave="title" contains="series,serie,soap,thriller,comedy,drama" optional="productiondate,actor,director"/>

<matchmovie mustmatch="title" optional="productiondate,actor,director" minimum="2"/>
<matchserie mustmatch="subtitle" optional="productiondate,actor,director" minimum="2"/>

This is very strange: as the examples above show, these movies have unique titles and dates, the postprocess sees them, and they don't conflict with anything on the IMDb site. Still, they don't get a match.

Also, the matched movies don't get their actors scrubbed at all.

What am I doing wrong?

Can someone please give some assistance?

Thanks in advance.

WGMaker
Offline
WG++ Team member, Donator
Joined: 12 years
Last seen: 8 hours

Hi, can you upload an xmltv input file with at least a few of the shows that don't match? (It doesn't matter if they are from a while ago.)
There is also a small chance that the shows were listed as <no-match> in your ldb (local database) file on some earlier occasion. Have a look at entries like this:
  <no-match>
    <entry-date>20151106</entry-date>
    <attempts>3</attempts>
    <searchstring>Alien 3 +  David Fincher</searchstring>
  </no-match>
If the value of <attempts> is 3 or more, the postprocessor will assume the no-match is hopeless and won't try again. To give it another try, either remove these entries or change the <attempts> value to 0.
Then run again (you can set grab="n", as in <postprocess run="y" grab="n">mdb</postprocess> in the webgrab++.config, to avoid grabbing again).
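
For anyone who prefers not to edit the ldb file by hand, here is a minimal Python sketch of the reset described above. It relies only on the <no-match> / <attempts> structure shown in the example; the <ldb> root element and the second sample entry are made up for illustration, and the real file layout may differ. Back up the file before trying anything like this.

```python
import xml.etree.ElementTree as ET


def reset_no_match_attempts(xml_text: str, threshold: int = 3) -> tuple:
    """Set <attempts> to 0 on every <no-match> entry at or above the threshold.

    Returns the updated XML text and the number of entries that were reset.
    """
    root = ET.fromstring(xml_text)
    changed = 0
    for entry in root.iter("no-match"):
        attempts = entry.find("attempts")
        text = (attempts.text or "").strip() if attempts is not None else ""
        if not text.isdigit():
            continue  # skip entries without a usable <attempts> value
        if int(text) >= threshold:
            attempts.text = "0"
            changed += 1
    return ET.tostring(root, encoding="unicode"), changed


# Hypothetical sample: the <ldb> root and the second entry are assumptions.
sample = """<ldb>
  <no-match>
    <entry-date>20151106</entry-date>
    <attempts>3</attempts>
    <searchstring>Alien 3 +  David Fincher</searchstring>
  </no-match>
  <no-match>
    <entry-date>20151106</entry-date>
    <attempts>1</attempts>
    <searchstring>Twister + (1996) + </searchstring>
  </no-match>
</ldb>"""

updated_xml, changed = reset_no_match_attempts(sample)
print(changed, "entry reset")
```

Only entries that reached the threshold are touched; entries with fewer attempts are left alone, since the postprocessor will retry those anyway.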

Jan

karimf

Hi Jan,

Thanks for your reply.

Attached is a sample of some shows that didn't get a match.

I also noticed that if the 'director' name (on the IMDb website) includes a letter like "ö, à, etc.", it doesn't get a match. I think this is a letter encoding issue that can't be solved.

I tried increasing the 'mustmatch' minimum to 3 and 4 to force it to include 'actor', but the <searchstring> always uses title + productiondate + director.

How can I force it to use the 'actor' element? Maybe this will get a match.

The interesting part is that some shows with normal letter encoding and the right title, director and productiondate don't get a match. It is not a lot of shows, only very few, but I think it is strange behavior.

Thanks again.

 

WGMaker

Hi Karimf,

I ran your sample file. The results:
With <site movies="imdb.com.ask, imdb.com.imdb" series="imdb.com.imdb_series"></site>
6 out of 8 matched. Not such a bad score! No match for
- The Abyss: Special Edition, and
- S1MØNE
Let me explain why:
The Abyss: Special Edition
had to be matched with the IMDb name The Abyss (IMDb lists this special edition under that name).
I won't try to explain all the details of the title match algorithm, but due to a small error in it, it didn't match because the two names were too different. However, because of the name separator ':' between The Abyss and Special Edition, it should have matched.
The matching algorithm has been corrected and will be part of the next beta, 55.8, but that might take some time because it is too small an update for a new beta.
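
To illustrate the separator idea only: the sketch below is not the WebGrab++ title match algorithm (its details are not given here), just one simple way a ':' separator could let "The Abyss: Special Edition" match "The Abyss". The function name and the normalization are assumptions.

```python
def titles_match(guide_title: str, imdb_title: str, separator: str = ":") -> bool:
    """Compare two titles, also accepting the part before the separator.

    Hypothetical helper: exact (case-insensitive) match first, then a match
    on the main title that precedes the separator in the guide title.
    """
    def norm(text: str) -> str:
        # Case-insensitive comparison with collapsed whitespace.
        return " ".join(text.casefold().split())

    if norm(guide_title) == norm(imdb_title):
        return True
    main_part = guide_title.split(separator, 1)[0]
    return norm(main_part) == norm(imdb_title)


print(titles_match("The Abyss: Special Edition", "The Abyss"))
```

A title without a separator simply falls back to the exact comparison, so "Twister" still only matches "Twister".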

S1MØNE: there is no match because the primary search engine, ASK, doesn't list the IMDb entry of this movie. Nothing you can do about that.
In this case, using BING as the primary search engine solves the problem.

So in the end, 8 matches out of 8 with
<site movies="imdb.com.bing, imdb.com.imdb" series="imdb.com.imdb_series"></site>

Your other remarks:
- Characters like "ö, à, etc.", so-called accented characters, are converted into their unaccented equivalents (o, a, etc.). So I can't explain why you see any no-matches due to that.
- Actors and director. The matching algorithm uses a matching value that starts at 0. It first looks at the 'mustmatch' elements; they must all match for it to go on. Then it looks at the optionals. If your xmltv has both director and actor(s), the matching algorithm will start with the director. If it finds a match, the matching value is increased by 1; if the director doesn't match, the value stays the same. It then looks at the actors in the same way (it scans all the actors until a match is found). You don't need to force anything; all the optionals are processed automatically. In the end, the matching value is compared with 'minimum'. Equal or higher = match.
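
The two points above can be sketched roughly as follows. This is a simplified Python illustration, not the actual WebGrab++ code: the Show class and function names are invented, it assumes that 'mustmatch' elements do not themselves add to the matching value, and it assumes productiondate is checked before director and actors (the explanation above only fixes the order director, then actors).

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List, Optional


def fold(text: str) -> str:
    """Lowercase and drop combining accents, e.g. ö -> o, à -> a.

    Letters without a Unicode decomposition (such as Ø) are left unchanged
    by this approach.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold().strip()


@dataclass
class Show:
    """Hypothetical container for the elements used in matching."""
    title: str
    productiondate: str = ""
    director: str = ""
    actors: List[str] = field(default_factory=list)


def match_value(guide: Show, candidate: Show) -> Optional[int]:
    """Return the matching value, or None when the mustmatch title fails."""
    # mustmatch="title": must match before anything else is considered.
    if fold(guide.title) != fold(candidate.title):
        return None
    value = 0  # the matching value starts at 0
    # Assumption: productiondate is scored like the other optionals.
    if guide.productiondate and guide.productiondate == candidate.productiondate:
        value += 1
    # Director is checked before the actors.
    if guide.director and fold(guide.director) == fold(candidate.director):
        value += 1
    # Actors: scan until one matches; at most one point in total.
    candidate_actors = {fold(a) for a in candidate.actors}
    if any(fold(a) in candidate_actors for a in guide.actors):
        value += 1
    return value


def is_match(guide: Show, candidate: Show, minimum: int = 2) -> bool:
    """Equal to or higher than 'minimum' counts as a match."""
    value = match_value(guide, candidate)
    return value is not None and value >= minimum


guide = Show("In the line of fire", "1993", "Wolfgang Petersen", ["Clint Eastwood"])
imdb = Show("In the Line of Fire", "1993", "Wolfgang Petersen",
            ["Clint Eastwood", "John Malkovich"])
print(match_value(guide, imdb), is_match(guide, imdb))
```

In this sketch the actors contribute a single point no matter how many of them match, which mirrors "it scans all the actors until a match is found".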

Jan

karimf

Hi Jan,

Thanks for your help and support.

Now, how come you got 6 of 8 matches while I didn't for those same movies, using the same <site movies="imdb.com.ask, imdb.com.imdb" series="imdb.com.imdb_series"></site>? This is strange.

Do you have any explanations for this ?

Thanks.

WGMaker

Two reasons I can think of:

1. The unmatched shows are already marked as <no-match> in your ldb database file (maybe due to some earlier trial runs with other settings). Try to locate them and remove the <no-match> to </no-match> block of these shows.
Example:
  <no-match>
    <entry-date>20150912</entry-date>
    <attempts>4</attempts>
    <searchstring>Rachael Ray * + Chef Curtis Stone Is Rachael's Co-Host, and They're Giving You the 411 on Everything From Food to Fall Organizing! + (2015) + </searchstring>
  </no-match>

As an alternative, you can also set the attempts value to 0.

2. Is the 'minimum' value in your mdb config too high? A value of 2 is enough to avoid too many 'false' matches.

Jan

francis
Offline
Has donated long time ago, WG++ Team member
Joined: 12 years
Last seen: 1 week
karimf wrote:

Do you have any explanations for this ?

Because his name is Jan? ;)

karimf

Hi Francis,

Your funny comment made me smile :) I think maybe it's because the site knows that the grab is coming from WGMaker, so it has to work :))

Anyway, my friends, I have to thank both of you for the continuous support and effort you give to all of us in this friendly WG++ community.

karimf

Hi Jan,

Thanks for your advice, I did it before and it worked.

Still, I have one remark that may be useful to you as a developer.

Some movies don't get a match, but if I just run the postprocess again 30 minutes later, they do get matched, even without removing the <no-match> entries (the <attempts> value is 1).

I don't understand it, but the overall match rate of the first run is always above 95% which is a fantastic rate by all means.

Thanks again Jan.

