Yes
Is this a BUG / ISSUE report or a QUESTION?
BUG
System information. For client/server mode post info for both machines.
Your borg version (borg -V).
borg 1.1.10
Operating system (distribution) and version.
macOS 10.14.6
Hardware / network configuration, and filesystems used.
MBP, Mac OS Extended
How much data is handled by borg?
1.5 TB
Full borg commandline that led to the problem (leave away excludes and passwords)
borg init -e none test.borg
mkdir test
filename=`printf 'la\xcc\x88.txt'`
echo "hello" >test/$filename
borg create test.borg::a test
mkdir test2
borg mount test.borg::a test2
ls test
# lä.txt
ls test2/test
# lä.txt
cat test/lä.txt
# hello
cat test2/test/lä.txt
# cat: test2/test/lä.txt: No such file or directory
So there is a file we can list but not open. I have the same problem with a real archive.
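To see why the two listings look identical but behave differently, it helps to compare the raw bytes of the names rather than what the terminal renders. The following is only a diagnostic sketch: `normalization_forms` and `describe_directory` are hypothetical helper names, not part of borg, and the sketch assumes file names are UTF-8 (as on macOS).

```python
import os
import unicodedata


def normalization_forms(raw_name):
    """Return the Unicode normalization forms (NFC, NFD) a raw file name is already in."""
    # Assumption: names are UTF-8; undecodable bytes are kept via surrogateescape.
    text = raw_name.decode("utf-8", "surrogateescape")
    return [form for form in ("NFC", "NFD")
            if unicodedata.normalize(form, text) == text]


def describe_directory(path):
    """Print each entry of a directory as raw bytes together with its forms."""
    # Passing bytes to os.listdir makes it return bytes names.
    for raw_name in sorted(os.listdir(os.fsencode(path))):
        print(raw_name, normalization_forms(raw_name))
```

Calling `describe_directory("test")` and `describe_directory("test2/test")` prints the bytes that each directory listing actually returns, which can then be compared with the name that reaches the FUSE layer on open.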
Thoughts
So the problem here is likely related to https://code.google.com/archive/p/macfuse/issues/139#c2: a mix-up between the precomposed (NFC) and decomposed (NFD) Unicode forms of the same file name.
In fuse.py, in lookup, we have this:
inode = self.contents[parent_inode].get(name)
Here we get name encoded as b'l\xc3\xa4.txt' (precomposed). But in the archive, in self.contents[parent_inode], we have {b'la\xcc\x88.txt': 1000041} (decomposed). The two byte strings differ, so the dictionary lookup fails, even though both forms are canonically equivalent in Unicode:
>>> import os, unicodedata
>>> os.fsdecode(b'l\xc3\xa4.txt')
'lä.txt'
>>> os.fsdecode(b'la\xcc\x88.txt')
'lä.txt'
>>> unicodedata.normalize("NFD", os.fsdecode(b'l\xc3\xa4.txt')) == unicodedata.normalize("NFD", os.fsdecode(b'la\xcc\x88.txt'))
True
I would have liked to submit a patch but I honestly don't know the correct way to deal with this. If I add name = os.fsencode(unicodedata.normalize("NFD", os.fsdecode(name))) in lookup, it works for this particular error. But it would break if the archive used NFC instead.
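One way around the NFC-versus-NFD asymmetry of that one-line change would be to try the name as given and then fall back to both normalized forms. This is only a sketch of the idea, not borg's actual code: `lookup_name` is a hypothetical helper, `contents` stands in for self.contents[parent_inode], and names are assumed to be UTF-8.

```python
import unicodedata


def lookup_name(contents, name):
    """Return the inode for name, trying the exact bytes first, then NFC and NFD."""
    inode = contents.get(name)
    if inode is not None:
        return inode
    # Assumption: names are UTF-8; surrogateescape keeps undecodable bytes intact.
    text = name.decode("utf-8", "surrogateescape")
    for form in ("NFC", "NFD"):
        candidate = unicodedata.normalize(form, text).encode("utf-8", "surrogateescape")
        inode = contents.get(candidate)
        if inode is not None:
            return inode
    return None
```

This still misses archived names that are in neither form (for example a name mixing precomposed and decomposed characters), so it is a partial workaround at best.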
We could also normalize the names before writing them to self.contents and normalize the requested name in lookup in the same way. This would work in both cases. But which normalization form is supposed to be used to begin with?
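The normalize-on-write idea could look roughly like the sketch below. `NormalizedContents` and `normalize_key` are hypothetical names, not borg's; NFD is chosen arbitrarily as the internal key form, since the key form does not matter as long as writes and lookups agree. Names are assumed to be UTF-8.

```python
import unicodedata


def normalize_key(name, form="NFD"):
    """Map a raw file name to a normalization-insensitive dictionary key."""
    # Assumption: names are UTF-8; surrogateescape keeps undecodable bytes intact.
    text = name.decode("utf-8", "surrogateescape")
    return unicodedata.normalize(form, text).encode("utf-8", "surrogateescape")


class NormalizedContents:
    """Directory index keyed by normalized names (stand-in for one self.contents entry)."""

    def __init__(self):
        self._by_key = {}

    def add(self, name, inode):
        # Keep the original name too, so a directory listing can still return it.
        self._by_key[normalize_key(name)] = (name, inode)

    def get(self, name):
        entry = self._by_key.get(normalize_key(name))
        return None if entry is None else entry[1]
```

A caveat with this design: an archive created on a file system that does not normalize names could contain two distinct entries that differ only in normalization, and they would collide on the same key here.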