Start: Properly hash thumbnail filenames

Image filenames should be percent-encoded as URIs before being hashed.
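
As an illustration of the rule above, here is a minimal Python 3 sketch of
the quote-then-hash step. The patch itself targets Python 2 and calls
urllib.quote; the helper names file_uri and thumbnail_path are hypothetical
and exist only for this example.

```python
import hashlib
import os
from urllib.parse import quote


def file_uri(path):
    """Build a percent-encoded file URI from a local path (hypothetical helper)."""
    # ":" and "/" are kept literal so the "file://" prefix and the
    # path separators survive quoting; other reserved or non-ASCII
    # characters become %XX escapes.
    return quote("file://" + path, safe=":/")


def thumbnail_path(path):
    """Return the thumbnail location derived from the hashed URI (hypothetical helper)."""
    # The quoted URI is pure ASCII, so the bytes fed to MD5 are unambiguous.
    fhash = hashlib.md5(file_uri(path).encode("ascii")).hexdigest()
    return os.path.join(os.path.expanduser("~"), ".thumbnails", "normal", fhash + ".png")
```

With this sketch, a path containing a space is hashed as "file:///tmp/a%20b.png"
rather than as the raw string.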

Also assume that filenames are already utf8 because the community
has been advertising utf8 usage since the beginning of this century.

Calling .encode("utf8") on non-ascii strings that are already in utf8
simply raises the following exception:

  UnicodeDecodeError: 'ascii' codec can't decode byte ...
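
The failure can be reproduced in a small sketch. In Python 2, .encode() on a
byte string first decodes it implicitly with the ascii codec. Python 3 has
no such implicit step, so the sketch below spells it out explicitly; the
function name reencode is hypothetical.

```python
def reencode(data):
    """Mimic Python 2's str.encode("utf8") on a byte string (hypothetical helper)."""
    # Python 2 performs this ascii decode implicitly before encoding.
    return data.decode("ascii").encode("utf8")


raw = "\u00e9".encode("utf8")  # bytes of a filename that is already utf8

message = None
try:
    reencode(raw)
except UnicodeDecodeError as exc:
    # The implicit decode fails on the first non-ascii byte.
    message = str(exc)

# Pure-ascii input passes through unchanged, which is why the bug
# only shows up with non-ascii filenames.
ascii_result = reencode(b"photo.png")
```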

It's in fact impossible to precisely determine pathname encoding
because different components within the path may have different
encodings, e.g., a utf8 directory name followed by an MBCS filename
is valid on Linux native filesystems.
It's the user's responsibility to keep the iocharset consistent.
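
A short sketch of the mixed-encoding case described above, written for
Python 3 under the assumption that the path is handled as raw bytes. The
directory and file names are made up for illustration. Percent-encoding
works byte by byte, so it needs no knowledge of the encoding of each
component.

```python
import hashlib
from urllib.parse import quote

# Hypothetical path: a utf8 directory name followed by a GBK (MBCS) filename.
directory = "\u56fe\u7247".encode("utf8")
filename = "\u7167\u7247.png".encode("gbk")
path = b"/home/user/" + directory + b"/" + filename

# The path as a whole is not valid utf8, so no single codec decodes it.
try:
    path.decode("utf8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# quote() accepts bytes and escapes every non-ascii byte as %XX,
# so the URI can be built and hashed without decoding the path.
uri = quote(b"file://" + path, safe=":/")
fhash = hashlib.md5(uri.encode("ascii")).hexdigest()
```

The resulting hash depends only on the bytes stored on disk, which is why a
consistent iocharset matters for matching existing thumbnails.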
GUAN Xin
2021-07-21 16:49:06 +08:00
parent 5bef06f67a
commit dab4168008

@@ -109,8 +109,8 @@ def getInfo(filename):
         import gnomevfs
     except Exception:
         # alternative method
-        import hashlib
-        fhash = hashlib.md5(("file://"+path).encode("utf8")).hexdigest()
+        import hashlib, urllib
+        fhash = hashlib.md5(urllib.quote("file://"+path,safe=":/")).hexdigest()
         thumb = os.path.join(os.path.expanduser("~"),".thumbnails","normal",fhash+".png")
     else:
         uri = gnomevfs.get_uri_from_local_path(path)