Item: WALL·E
Language: en-US
Type of Problem: Design_issue
Extra Details: Since the title was changed from Wall-E to Wall.E, the movie no longer scrapes properly, even if the file uses a valid alternative title. I propose changing the title back to Wall-E to improve scraping and for consistency with the historically used title. Additionally, most file systems cannot support a (.) in the file name, compounding the scrape errors. With the new title, the movie also does not come up in related searches on the TMDB website, reinforcing the error caused by using a non-standard character in the title. Since the change was recent (a few months ago) and the movie is from 2008, the error may not be widely seen yet and may only appear upon database rebuilds.
Reply by superboy97
on July 6, 2023 at 12:17 AM
If I search for "wall.e", the movie title is directly suggested by the search engine, and the movie is found. (These screenshots were taken in French, but it's exactly the same in English.)
The "." in the filename is perfectly supported by filesystems (Windows, Linux, ...). Even "WALL·E" can be perfectly used as a filename.
For scraping problems, you need to contact the support team of the scraper you are using.
Reply by bearclaws8
on July 6, 2023 at 6:46 AM
I will reach out to the scraping team separately. But I believe there is still an issue with the current title and the website search, which may be impacting other things.
Searching for "wall.e", "WALL.E", or any other iteration with (.) is the only way to get the proper title from the search engine on the first page, at least in English. Searching for "WALL-E" or "Wall E", the other two alternate titles in English, places the proper movie on page 9 of the search results. In fact, in my quick trial, every iteration not using (.) placed the proper movie on page 9 of the results and favored many movies whose titles did not contain "wall" over the correct film. Few users will expect the proper movie to be so far into the results. This is in contrast to the prior behavior, which placed it on the first page before the (.) was introduced (I admit that this is from memory). The typical convention for this movie has been to list its title with a (-) character, including by the movie owner.
I can appreciate trying for accuracy in the title by using (·) but it appears that the character is not being handled properly by the search algorithm and may also be causing other downstream effects. Is it possible that using the (·) character is causing issues more than if it was (.), considering how it may be being handled in the background?
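To illustrate the kind of background handling in question, here is a minimal sketch of separator-insensitive title matching. It is not TMDB's actual search algorithm, which is not described in this thread; the function name `normalize_title` and the chosen separator set are assumptions made purely for illustration.

```python
import re
import unicodedata

# Hypothetical separator set: whitespace, hyphen-minus, full stop,
# middle dot (U+00B7) and bullet (U+2022).
_SEPARATORS = re.compile(r"[\s\-.\u00B7\u2022]+")


def normalize_title(title: str) -> str:
    """Fold case and collapse runs of separator characters to one space."""
    folded = unicodedata.normalize("NFKC", title).casefold()
    return _SEPARATORS.sub(" ", folded).strip()


# Title variants mentioned in this thread.
variants = ["WALL·E", "WALL•E", "WALL.E", "wall.e", "WALL-E", "Wall E"]
keys = {normalize_title(v) for v in variants}
print(keys)
```

Under this assumed scheme every variant maps to the same lookup key, so a query typed with (-), (.), a space, or either dot glyph would match the same stored title.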
Reply by neoeinstein
on September 9, 2023 at 7:48 PM
I would add that the title as shown on the released Blu-ray discs suggests that the appropriate name uses a different glyph: WALL•E (U+2022 Bullet), instead of WALL·E (U+00B7 Middle Dot). This is the glyph used for the Spanish translation, and it is the glyph suggested by the makers of the movie and its released physical media.
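For anyone who wants to check which glyph a given title string actually contains, a short sketch using Python's standard `unicodedata` module lists the code point and Unicode name of each separator discussed in this thread. The helper name `describe` is only illustrative.

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the code point and Unicode name of a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# The four separator characters mentioned in this thread.
for ch in ["-", ".", "\u00B7", "\u2022"]:
    print(describe(ch))
# U+002D HYPHEN-MINUS
# U+002E FULL STOP
# U+00B7 MIDDLE DOT
# U+2022 BULLET
```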