Possible bug in IMDB scraper (quick fix included)

Discussion of XBMC4XBOX development.
Indiana_Jones
Posts: 5
Joined: Sat Sep 15, 2012 1:14 pm

Post by Indiana_Jones »

Hello! :)

I'm not sure this is the right section, since this isn't really "development", just a fix to an XML file.

After upgrading from XBMC 3.1 to 3.2 on my good old Xbox, I finally started to sort and reorganize my movie collection.
While using the IMDB scraper, I noticed that with the "Enable full cast credits" option, the "Cast" tab for the movie was empty after fetching the data.

I had a look at the debug log and saw this:

13:43:03 M: 25640960 DEBUG: scraper: GetIMDBCast returned <details></details>

Clearly empty. :-)
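
For anyone who wants to check their own log for the same symptom, here is a minimal sketch in Python. It assumes the debug lines have the same shape as the one quoted above; the function name `empty_scraper_results` is made up for this example and is not part of XBMC.

```python
import re

# Matches debug lines shaped like the one quoted in the post, e.g.
#   13:43:03 M: 25640960 DEBUG: scraper: GetIMDBCast returned <details></details>
RETURNED = re.compile(r"DEBUG: scraper: (\w+) returned (.*)$")


def empty_scraper_results(log_lines):
    """Return the names of scraper functions that returned an empty <details> element."""
    empty = []
    for line in log_lines:
        match = RETURNED.search(line.rstrip("\n"))
        if match and match.group(2).strip() == "<details></details>":
            empty.append(match.group(1))
    return empty


sample = [
    "13:43:03 M: 25640960 DEBUG: scraper: GetIMDBCast returned <details></details>",
]
print(empty_scraper_results(sample))
```

To scan a real log, pass the open file object instead of the sample list (the log file's name and location depend on your setup).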

I compared the imdb.xml from the system/commons folder with the one in the metadata.common.imdb.com folder of the most recent Windows build. To cut it short, the method for fetching the actors differs in certain places.

To fix it, I just copied the contents of the routine ParseIMDBFullCast from the Windows build into GetIMDBCast in the Xbox build. Now the "Full cast" option is working again and actors are properly fetched.

Replace the subroutine GetIMDBCast in system/scrapers/video/commons/imdb.xml with this one:
*Edit: Path to xml file corrected

http://pastebin.ca/2204464
Sorry for the weird formatting, I just couldn't get it right. :?

Afterwards, from the debug log:

14:06:41 M: 24702976 DEBUG: scraper: GetIMDBCast returned <details><actor><thumb>http://ia.media-imdb.com/images/M/MV5BM ... e>Terrence Mann</name><role>Ug</role>...etc...

I don't know if this is sufficient; maybe a developer can comment.
If further details are needed, I'll gladly provide the log files.

Best regards
BuZz
Site Admin
Posts: 1891
Joined: Wed Jul 04, 2012 12:50 am
Location: UK
Has thanked: 66 times
Been thanked: 423 times

Post by BuZz »

Thanks, but did you check the svn version first? I seem to remember fixing this not long ago. Any patches or fixes need to be against the dev release, and you may well find it's already sorted there.

Post by Indiana_Jones »

Hi BuZz,

Yes, I compared the one from the trunk with the one tagged for the 3.2 release, and they were identical (rev 31136).

I checked again in the XBMC4XBOX SVN:

Revision 31101 has a version that is similar to my "fixed" version for fetching the full cast.
Revision 31136 has the version that is currently in the 3.2 release and on the trunk.

I have now also checked the XBMC mainline on GitHub.

After comparing the files, I think I know what the issue is.

The mainline has two functions to get the cast (ParseIMDBCast and ParseIMDBFullCast).
The Xbox one has only one (GetIMDBCast).
So a different one was copied in each of the two revisions:
Revision 31101: GetIMDBCast (Xbox) == ParseIMDBFullCast (XBMC mainline)
Revision 31136: GetIMDBCast (Xbox) == ParseIMDBCast (XBMC mainline)

So in each revision, only one of the two settings could work.
To verify, I disabled the "Enable full cast & credits" option with my "fixed" version and ran the scan again. Now the actors were missing.
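
The mismatch can be written down as a tiny model in Python. This is only a sketch of the reasoning, not the scraper's real logic: the revision numbers and function names come from the comparison above, while the rule "actors are only found when the parser matches the page that was fetched" is a simplifying assumption.

```python
# Simplified model of the mismatch described in the post. The revision numbers
# and function names come from the post; the matching rule is an assumption.

# Which mainline routine each Xbox revision's GetIMDBCast corresponds to.
GETIMDBCAST_MATCHES = {
    31101: "ParseIMDBFullCast",
    31136: "ParseIMDBCast",
}


def mainline_parser_for(full_cast_enabled: bool) -> str:
    """Mainline routine written for the page fetched under this setting."""
    return "ParseIMDBFullCast" if full_cast_enabled else "ParseIMDBCast"


def cast_is_fetched(revision: int, full_cast_enabled: bool) -> bool:
    """Assume actors are only found when the parser matches the fetched page."""
    return GETIMDBCAST_MATCHES[revision] == mainline_parser_for(full_cast_enabled)


for revision in (31101, 31136):
    for enabled in (True, False):
        print(revision, "full cast enabled:", enabled, "->", cast_is_fetched(revision, enabled))
```

Under this model, a single GetIMDBCast can never cover both settings, which is consistent with the empty result seen on 3.2 with the option enabled.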

As a new fix, I did this:
I took the version delivered with the 3.2 release as the baseline again and
1) Added a new function "GetIMDBFullCast" to the imdb.xml in the "common" dir; it is the same as ParseIMDBFullCast from the XBMC mainline
http://pastebin.ca/2204493
2) Changed line 123 of the imdb.xml in the video scraper to call this new function for "fullcredits.htm"
http://pastebin.ca/2204495

Now it works with the "Full cast" option enabled or disabled.
Option disabled: Actors are fetched
Option enabled: Cast & Director are fetched

Best regards

Post by BuZz »

I can only say that when I made the change in r31136, cast scraping did work, but I may not have tested both with and without full cast. I appreciate the feedback.

IMDb changes its site around a lot, so the regular expressions need updating (and I normally grab them from mainline if I can, as you have done). If you post diffs against svn HEAD rather than modified files, it will be easier for me to see what you have changed in the two functions. Thanks. (Opening a ticket on the bugtracker for the patches would be the fastest route to get them into the codebase.) Cheers.
Dan Dar3
Posts: 1176
Joined: Sun Jul 08, 2012 4:09 pm
Has thanked: 273 times
Been thanked: 257 times

Post by Dan Dar3 »

@Indiana_Jones
see this on how to create a diff / patch with TortoiseSVN, if that's what you use.
http://tortoisesvn.net/docs/release/Tor ... patch.html

Post by BuZz »

Thanks, Dan. Indy, I have approved your Redmine account.

Post by Indiana_Jones »

Hi,

@Dan Dar3
Thanks for the HowTo.
So far I have just downloaded the files manually through the browser from the webpage and done the diff on the CLI.
While I'm familiar with version control, I have no experience with SVN, nor have I set up anything to access the XBMC repository.
I was planning on posting the diff output from the CLI, if that is sufficient. :)

Best regards

Post by BuZz »

Sorry, I missed this last post. I have written on the bugtracker regarding diffs, but I need diffs against svn, made with svn (or TortoiseSVN on Windows). This is really the only sensible way. Checking out svn, modifying some files, and making a diff is pretty simple, and there are lots of tutorials on this. Thanks.

Post by Indiana_Jones »

Hi,

thanks for the feedback.
I made the diffs on Mac OS, so with Unix tools.

But I also have a Windows machine, so I will set up TortoiseSVN and follow Dan's instructions.
I hope to get it done this week; I'll update the ticket then.

Best regards

Post by BuZz »

You can do it on Mac OS. Just use whatever is a popular svn client on it. Command-line svn on OS X would be fine: a simple svn checkout, make your changes, and run "svn diff" from the command line, redirecting the output to mypatch.diff.
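
For reference, the last step can also be scripted. The Python sketch below does the same thing as running `svn diff > mypatch.diff` inside a working copy; it assumes the command-line `svn` client is installed and that you have already done an `svn checkout` (the repository URL is not given in this thread) and edited the files. The helper name `write_svn_patch` and its `runner` parameter are made up for this example.

```python
import subprocess
from pathlib import Path


def write_svn_patch(working_copy, patch_file="mypatch.diff", runner=subprocess.run):
    """Run `svn diff` inside an SVN working copy and save the output as a patch file.

    `runner` defaults to subprocess.run; it is a parameter only so the helper
    can be exercised without a real svn installation.
    """
    result = runner(
        ["svn", "diff"],       # same command you would type on the CLI
        cwd=working_copy,      # run it inside the checked-out working copy
        capture_output=True,   # collect the diff text instead of printing it
        text=True,
        check=True,            # raise CalledProcessError if svn fails
    )
    out = Path(patch_file)
    out.write_text(result.stdout)
    return out
```

Calling `write_svn_patch("path/to/working-copy")` would then leave a mypatch.diff that can be attached to the ticket, where the path is whatever directory you checked out into.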

Post by Indiana_Jones »

Hi,

I updated it.
Hope this is usable now. :)

Best regards
WhiteSpy
Posts: 16
Joined: Sun Oct 28, 2012 2:55 pm

Post by WhiteSpy »

Where do I download the working IMDB scraper from?