**------------------------------------------------------------------------------------------------
* @header_start
* WebGrab+Plus ini for grabbing IMDB data from TvGuide websites
* @MinSWversion: V1.1.1/55.27
* @Site: imdb.com, primary search with bing.com
* @Revision 8 - [25/09/2015] Jan van Straaten
*   - added mdb_category
* @Revision 7 - [10/10/2014] Jan van Straaten
*   - improved showid scrub, also numbers up to 5000000 (was 2500000)
* @Revision 6 - [09/06/2014] Jan van Straaten
*   - added url header
* @Revision 5 - [07/06/2014] Jan van Straaten/Jagad
*   - added mdb_showicon
* @Revision 4 - [20/12/2013] Jan van Straaten
*   - changes in (aka)titles
* @Revision 3 - [23/11/2013] Jan van Straaten
*   - changes in actor and director due to site changes
* @Revision 2 - [11/08/2013] Jan van Straaten
*   - small changes in title, actor and commentsummary due to imdb.com changes
* @Revision 1 - [14/04/2012] Jan van Straaten
*   - correction in production date
* @Remarks: none
* @header_end
**------------------------------------------------------------------------------------------------
*
*
site {url=imdb.com|cultureinfo=en-GB|charset=UTF-8|matchfactor=60|searchsite=bing}
* primary search:
url_primarysearch {url|http://www.bing.com/search?q=|imdb+title/tt+|'title'|+|'productiondate'|+|'credit'|&scope=web&setmkt=en-US&qs=ns&form=QBRE&qb=2}
*scope=web&setmkt=es-ES&setlang=match
show_id.scrub {regex()|primary||title/tt(\d{7})/||}
*
* filter showid (7 char long):
show_id.modify {remove| } * remove spaces
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|4}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(type=char format=F0)|'mdb_temp_2' #} * how many chars in this show_id?
show_id.modify {remove('mdb_temp_3' not "7" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not 7 chars
* end loop
* filter showid (only numbers and < 5000000):
mdb_temp_1.modify {calculate(type=element format=F0)|'show_id' #} * number of show_id's = loop index
loop {('mdb_temp_1' > "0" max=50)|5}
mdb_temp_1.modify {calculate(format=F0)|1 -} * decrease index
mdb_temp_2.modify {substring(type=element)|'show_id' 'mdb_temp_1' 1} * the showid to inspect
mdb_temp_3.modify {calculate(format=F0)|'mdb_temp_2'} * convert to number
show_id.modify {remove('mdb_temp_3' "0" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if not only numbers
show_id.modify {remove('mdb_temp_3' > "5000000" type=element)|'show_id' 'mdb_temp_1' 1} * remove this show_id if > 5000000
* end loop
*
* imdb url's:
url_mdb_p1 {url|primary|http://www.imdb.com/title/tt|show_id|/}
*url_mdb_p2.modify {addstart|'url_mdb_p1'plotsummary}
*url_mdb_p3.modify {addstart|'url_mdb_p1'releaseinfo#akas}
*url_mdb_p4.modify {addstart|'url_mdb_p1'reviews}
*url_mdb_p5.modify {addstart|'url_mdb_p1'fullcredits#cast}
*
url_mdb_p2 {url|primary|http://www.imdb.com/title/tt|show_id|/plotsummary}
url_mdb_p3 {url|primary|http://www.imdb.com/title/tt|show_id|/releaseinfo#akas}
url_mdb_p4 {url|primary|http://www.imdb.com/title/tt|show_id|/reviews}
url_mdb_p5 {url|primary|http://www.imdb.com/title/tt|show_id|/fullcredits#cast}
*
url_mdb.headers {customheader=Accept-Encoding=gzip,deflate}
*
* imdb elements
mdb_title.scrub {single(separator=":")|p1|||}* original title when redirected
mdb_title.modify {cleanup(tags="/=\""} * removes starting "
mdb_title.scrub {single(separator=" - " exclude="IMDb" include=first)|p1|
|| | |
|
\n\n|