mxmlnkn/ratarmount: Access large archives as a filesystem efficiently, e.g., TAR, RAR, ZIP, GZ, BZ2, XZ, ZSTD archives




Ratarmount collects all file positions inside a TAR so that it can easily jump to and read from any file without extracting it.
It then mounts the TAR using fusepy for read access, just like archivemount.
In contrast to libarchive, on which archivemount is based, random access and true seeking are supported.
And in contrast to tarindexer, which also collects file positions for random access, ratarmount offers easy access via FUSE and support for compressed TARs.
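
The core idea, recording where each member's data starts and then seeking straight to it, can be sketched with Python's standard tarfile module. This is only an illustration for an uncompressed TAR and not ratarmount's actual implementation; the helper names build_offset_index, read_member, and demo are made up for this example.

```python
import io
import os
import tarfile
import tempfile


def build_offset_index(tar_path):
    """Map each regular file name to (data offset, size) inside the TAR."""
    index = {}
    with tarfile.open(tar_path, "r:") as archive:
        for member in archive:
            if member.isfile():
                index[member.name] = (member.offset_data, member.size)
    return index


def read_member(tar_path, index, name):
    """Read one file by seeking to its data instead of scanning the archive."""
    offset, size = index[name]
    with open(tar_path, "rb") as raw:
        raw.seek(offset)
        return raw.read(size)


def demo():
    """Create a tiny TAR, index it once, and read one member by seeking."""
    with tempfile.TemporaryDirectory() as folder:
        tar_path = os.path.join(folder, "example.tar")
        with tarfile.open(tar_path, "w") as archive:
            for name, data in [("a.txt", b"hello"), ("b.txt", b"world!")]:
                info = tarfile.TarInfo(name)
                info.size = len(data)
                archive.addfile(info, io.BytesIO(data))
        index = build_offset_index(tar_path)
        return read_member(tar_path, index, "b.txt")
```

The index has to be built only once; every later read costs one seek plus one read, independent of where the member sits in the archive.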

Capabilities:

  • Random Access: Care was taken to achieve fast random access inside compressed streams for bzip2, gzip, xz, and zstd and inside TAR files by building indices containing seek points.
  • Highly Parallelized: By default, all cores are used for parallelized algorithms such as the gzip, bzip2, and xz decoders.
    This can yield huge speedups on most modern processors but requires more main memory.
    It can be controlled or completely turned off using the -P option.
  • Recursive Mounting: Ratarmount will also mount TARs inside TARs inside TARs, … recursively into folders of the same name, which is useful for the 1.31TB ImageNet data set.
  • Mount Compressed Files: You may also mount files with one of the supported compression schemes.
    Even if these files do not contain a TAR, you can leverage ratarmount’s true seeking capabilities when opening the mounted uncompressed view of such a file.
  • Read-Only Bind Mounting: Folders may be mounted read-only to other folders for use cases like merging a backup TAR with newer versions of those files residing in a normal folder.
  • Union Mounting: Multiple TARs, compressed files, and bind mounted folders can be mounted under the same mountpoint.
  • Write Overlay: A folder can be specified as write overlay.
    All changes below the mountpoint will be redirected to this folder and deletions are tracked so that all changes can be applied back to the archive.
  • Remote Files and Folders: A remote archive or whole folder structure can be mounted similar to tools like sshfs thanks to the filesystem_spec project.
    These can be specified with URIs as explained in the section “Remote Files”.
    Supported remote protocols include: FTP, HTTP, HTTPS, SFTP, SSH, Git, Github, S3, Samba v2 and v3, Dropbox, … Many of these are very experimental and may be slow. Please open a feature request if further backends are desired.

A complete list of supported formats can be found here.

  • ratarmount archive.tar.gz to mount a compressed archive at a folder called archive and make its contents browsable.
  • ratarmount --recursive archive.tar mountpoint to mount the archive and recursively all its contained archives under a folder called mountpoint.
  • ratarmount folder mountpoint to bind-mount a folder.
  • ratarmount folder1 folder2 mountpoint to bind-mount a merged view of two (or more) folders under mountpoint.
  • ratarmount folder archive.zip folder to mount a merged view of a folder on top of archive contents.
  • ratarmount -o modules=subdir,subdir=squashfs-root archive.squashfs mountpoint to mount an archive subfolder squashfs-root under mountpoint.
  • ratarmount http://server.org:80/archive.rar folder folder to mount an archive that is accessible via HTTP range requests.
  • ratarmount ssh://hostname:22/relativefolder/ mountpoint to mount a folder hierarchy via SSH.
  • ratarmount ssh://hostname:22//tmp/tmp-abcdef/ mountpoint
  • ratarmount github://mxmlnkn:ratarmount@v0.15.2/tests/ mountpoint to mount a GitHub repo as if it was checked out at the given tag or SHA or branch.
  • AWS_ACCESS_KEY_ID=01234567890123456789 AWS_SECRET_ACCESS_KEY=0123456789012345678901234567890123456789 ratarmount s3://127.0.0.1/bucket/one-file.tar mounted to mount an archive inside an S3 bucket reachable via a custom endpoint with the given credentials. Bogus credentials may be necessary for unsecured endpoints.
  1. Installation
    1. Installation via AppImage
    2. Installation via Package Manager
      1. Arch Linux
    3. System Dependencies for PIP Installation (Rarely Necessary)
    4. PIP Package Installation
  2. Supported Formats
    1. TAR compressions supported for random access
    2. Other supported archive formats
  3. Benchmarks
  4. The Problem
  5. The Solution
  6. Usage
    1. Metadata Index Cache
    2. Bind Mounting
    3. Union Mounting
    4. File versions
    5. Compressed non-TAR files
    6. Xz and Zst Files
    7. Remote Files
    8. Writable Mounting
    9. As a Library
    10. Fsspec Integration
    11. File Joining

You can install ratarmount either by simply downloading the AppImage or via pip.
The latter might require installing additional dependencies.

If you want all features, some of which may possibly result in installation errors on some systems, install with:

pip install ratarmount[full]

Installation via AppImage

The AppImage files are attached under “Assets” on the releases page.
They require no installation and can simply be executed like a portable executable.
If you want to install it, you can simply copy it into any of the folders listed in your PATH.

appImageName=ratarmount-0.15.0-x86_64.AppImage
wget "https://github.com/mxmlnkn/ratarmount/releases/download/v0.15.0/$appImageName"
chmod u+x -- "$appImageName"
./"$appImageName" --help  # Simple test run
sudo cp -- "$appImageName" /usr/local/bin/ratarmount  # Example installation

Installation via Package Manager

Arch Linux’s AUR offers ratarmount as a stable and a development package.
Use an AUR helper, such as yay or paru, to install one of them:

# stable version
paru -Syu ratarmount
# development version
paru -Syu ratarmount-git

Ratarmount is also available from conda-forge:

conda install -c conda-forge ratarmount

System Dependencies for PIP Installation (Rarely Necessary)

Python 3.6+, preferably pip 19.0+, FUSE, and sqlite3 are required.
These should be preinstalled on most systems.

On Debian-like systems such as Ubuntu, you can install/update all dependencies using:

sudo apt install python3 python3-pip fuse sqlite3 unar libarchive13 lzop gcc liblzo2-dev

On macOS, you have to install macFUSE and other optional dependencies with:

brew install macfuse unar libarchive lrzip lzop lzo

If you are installing on a system for which there exists no manylinux wheel, then you’ll have to install further dependencies that are required to build some of the Python packages that ratarmount depends on from source:

sudo apt install \
    python3 python3-pip fuse \
    build-essential software-properties-common \
    zlib1g-dev libzstd-dev liblzma-dev cffi libarchive-dev liblzo2-dev gcc

Then, you can simply install ratarmount from PyPI:

pip install ratarmount

Or, if you want to test the latest version:

python3 -m pip install --user --force-reinstall \
    'git+https://github.com/mxmlnkn/ratarmount.git@develop#egginfo=ratarmountcore&subdirectory=core' \
    'git+https://github.com/mxmlnkn/ratarmount.git@develop#egginfo=ratarmount'

If there are troubles with the compression backend dependencies, you can try the pip --no-deps argument.
Ratarmount will work without the compression backends.
The hard requirements are fusepy and, for Python versions older than 3.7.0, dataclasses.

TAR compressions supported for random access

Other supported archive formats

  • Not shown in the benchmarks, but ratarmount can mount files with preexisting index sidecar files in under a second, making it vastly more efficient compared to archivemount for every subsequent mount.
    Also, archivemount has no progress indicator, making it very unlikely the user will wait hours for the mounting to finish.
    Fuse-archive, an iteration on archivemount, has the --asyncprogress option to give a progress indicator using the timestamp of a dummy file.
    Note that fuse-archive daemonizes instantly but the mount point will not be usable for a long time and everything trying to use it will hang until then when not using --asyncprogress!
  • Getting file contents of a mounted archive is generally vastly faster than archivemount and fuse-archive and does not increase with the archive size or file count, resulting in the largest observed speedups being around 5 orders of magnitude!
  • Memory consumption of ratarmount is mostly less than archivemount and mostly does not grow with the archive size.
    Not shown in the plots, but the memory usage will be much smaller when not specifying -P 0, i.e., when not parallelizing.
    The gzip backend grows linearly with the archive size because the data for seeking is thousands of times larger than the simple two 64-bit offsets required for bzip2.
    The memory usage of the zstd backend only seems humongous because it uses mmap to open.
    The memory used by mmap is not even counted as used memory when showing the memory usage with free or htop.
  • For empty files, mounting with ratarmount and archivemount does not seem to be bounded by decompression nor I/O bandwidths but instead by the algorithm for creating the internal file index.
    This algorithm scales linearly for ratarmount and fuse-archive but seems to scale worse than even quadratically for archives containing more than 1M files when using archivemount.
    Ratarmount 0.10.0 improves upon earlier versions by batching SQLite insertions.
  • Mounting bzip2 and xz archives has actually become faster than archivemount and fuse-archive with ratarmount -P 0 on most modern processors because it actually uses more than one core for decoding those compressions. indexed_bzip2 supports block parallel decoding since version 1.2.0.
  • Gzip compressed TAR files are two times slower than archivemount during first time mounting.
    It is not totally clear to me why that is because streaming the file contents after the archive has been mounted is comparably fast, see the next benchmarks below.
    In order to have superior speeds for both of these, I am experimenting with a parallelized gzip decompressor like the prototype pugz offers for non-binary files only.
  • For the other cases, mounting times become roughly the same compared to archivemount for archives with 2M files in an approximately 100GB archive.
  • Getting a lot of metadata for archive contents as demonstrated by calling find on the mount point is an order of magnitude slower compared to archivemount. Because the C-based fuse-archive is even slower than ratarmount, the difference is very likely that archivemount uses the low-level FUSE interface while ratarmount and fuse-archive use the high-level FUSE interface.
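
The remark about the gzip index memory in the list above can be made concrete with a little arithmetic. The sketch assumes that each gzip seek point has to keep an uncompressed 32 KiB deflate window, which is what a decoder needs to resume in the middle of a stream; a real index may store this more compactly. The 16 MiB seek point spacing is a hypothetical value chosen only for the example.

```python
DEFLATE_WINDOW_BYTES = 32 * 1024  # maximum back-reference window of deflate
BZIP2_SEEK_POINT_BYTES = 2 * 8    # two 64-bit offsets per seek point


def index_bytes(archive_size, seek_point_spacing, bytes_per_seek_point):
    """Estimate the index size for seek points placed every seek_point_spacing bytes."""
    seek_points = -(-archive_size // seek_point_spacing)  # ceiling division
    return seek_points * bytes_per_seek_point


GIB = 1024 ** 3
MIB = 1024 ** 2
spacing = 16 * MIB  # hypothetical spacing, not a ratarmount default

gzip_estimate = index_bytes(100 * GIB, spacing, DEFLATE_WINDOW_BYTES)
bzip2_estimate = index_bytes(100 * GIB, spacing, BZIP2_SEEK_POINT_BYTES)
print(gzip_estimate // MIB, "MiB vs.", bzip2_estimate // 1024, "KiB")
```

Under these assumptions the per-seek-point cost differs by a factor of 2048, and both estimates grow linearly with the archive size.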

  • Reading files from the archive with archivemount scales quadratically instead of linearly.
    This is because archivemount starts reading from the beginning of the archive for each requested I/O block.
    The block size depends on the program or operating system and should be in the order of 4 kiB.
    Meaning, the scaling is O( (sizeOfFileToBeCopiedFromArchive / readChunkSize)^2 ).
    Both ratarmount and fuse-archive avoid this behavior.
    Because of this quadratic scaling, the average bandwidth with archivemount seems like it decreases with the file size.
  • Reading bz2 and xz are both an order of magnitude faster, as tested on my 12/24-core Ryzen 3900X, thanks to parallelization.
  • Memory is bounded in these tests for all programs but ratarmount is a lot more lax with memory because it uses a Python stack and because it needs to hold caches for a constant amount of blocks for parallel decoding of bzip2 and xz files.
    The zstd backend in ratarmount looks unbounded because it uses mmap, whose memory usage will automatically stop and be freed if the memory limit has been reached.
  • The peak for the xz decoder reading speeds happens because some blocks will be cached when loading the index, which is not included in the benchmark for technical reasons. The value for the 1 GiB file size is more realistic.
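
The quadratic scaling mentioned in the first bullet above can be reproduced with a simple cost model. It assumes, for simplicity, that the file starts at the beginning of the archive and that every chunk request rescans everything up to the end of that chunk; the function names are made up for this sketch.

```python
def bytes_scanned_with_restart(file_size, chunk_size):
    """Total bytes scanned if every chunk request restarts at offset 0."""
    total = 0
    for offset in range(0, file_size, chunk_size):
        end = min(offset + chunk_size, file_size)
        total += end  # scan from the beginning up to the end of this chunk
    return total


def bytes_scanned_with_seek(file_size):
    """Total bytes read if each chunk can be reached directly by seeking."""
    return file_size


one_mib = 1024 * 1024
restart_cost = bytes_scanned_with_restart(one_mib, 4096)
print(restart_cost / bytes_scanned_with_seek(one_mib))  # 128.5
```

Doubling the file size roughly quadruples the restart cost in this model, while the seek-based cost merely doubles.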

Further benchmarks can be viewed here.

You downloaded a large TAR file from the internet, for example the 1.31TB large ImageNet, and you now want to use it but lack the space, time, or a file system fast enough to extract all the 14.2 million image files.

Existing Partial Solutions

Archivemount seems to have large performance issues with too many files and with large archives, for both mounting and file access, in version 0.8.7. A more in-depth comparison benchmark can be found here.

  • Mounting the 6.5GB ImageNet Large-Scale Visual Recognition Challenge 2012 validation data set, and then testing the speed with: time cat mounted/ILSVRC2012_val_00049975.JPEG | wc -c takes 250ms for archivemount and 2ms for ratarmount.
  • Trying to mount the 150GB ILSVRC object localization data set containing 2 million images was given up on after 2 hours. Ratarmount takes ~15min to create a ~150MB index and <1ms for opening an already created index (SQLite database) and mounting the TAR. In contrast, archivemount will take the same amount of time even for subsequent mounts.
  • Does not support recursive mounting. Although, you could write a script to stack archivemount on top of archivemount for all contained TAR files.

Tarindexer is a command line tool written in Python which can create index files and then use the index file to extract single files from the tar fast. However, it also has some caveats which ratarmount tries to solve:

  • It only works with single files, meaning it would be necessary to loop over the extract-call. But this would require loading the possibly quite large tar index file into memory each time. For example for ImageNet, the resulting index file is hundreds of MB large. Also, extracting directories will be a hassle.
  • It’s difficult to integrate tarindexer into other production environments. Ratarmount instead uses FUSE to mount the TAR as a folder readable by any other programs requiring access to the contained data.
  • Can’t handle TARs recursively. In order to extract files inside a TAR which itself is inside a TAR, the packed TAR first needs to be extracted.

I didn’t find out about TAR Browser before I finished the ratarmount script. That’s also one of its cons:

  • Hard to find. I don’t seem to be the only one who has trouble finding it as it has one star on Github after 7 years compared to 45 stars for tarindexer after roughly the same amount of time.
  • Hassle to set up. Needs compilation and I gave up when I was instructed to set up a MySQL database for it to use. Confusingly, the setup instructions are not on its Github but here.
  • Doesn’t seem to support recursive TAR mounting. I didn’t test it because of the MySQL dependency but the code does not seem to have logic for recursive mounting.
  • Xz compression also is only block or frame based, i.e., only works faster with files created by pixz or pxz.

Pros:

  • supports bz2- and xz-compressed TAR archives

Ratarmount creates an index file with file names, ownership, permission flags, and offset information.
This sidecar is stored at the TAR file’s location or in ~/.ratarmount/.
Ratarmount can load that index file in under a second if it exists and then offers FUSE mount integration for easy access to the files inside the archive.
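
To give an impression of what such a sidecar index could look like, here is a minimal sketch using Python's built-in sqlite3 module. The table layout and the function names are hypothetical and much simpler than ratarmount's real index format; the batched executemany calls only illustrate why batching insertions helps when indexing millions of files.

```python
import sqlite3

INSERT_SQL = "INSERT INTO files VALUES (?, ?, ?, ?, ?, ?)"


def create_index(connection, entries, batch_size=1000):
    """Store (path, offset, size, mode, uid, gid) rows in batches."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY, offset INTEGER, size INTEGER,"
        " mode INTEGER, uid INTEGER, gid INTEGER)"
    )
    batch = []
    for entry in entries:
        batch.append(entry)
        if len(batch) >= batch_size:
            connection.executemany(INSERT_SQL, batch)
            batch.clear()
    if batch:
        connection.executemany(INSERT_SQL, batch)
    connection.commit()


def lookup(connection, path):
    """Return (offset, size) for a path or None if it is not indexed."""
    query = "SELECT offset, size FROM files WHERE path = ?"
    return connection.execute(query, (path,)).fetchone()


connection = sqlite3.connect(":memory:")  # a real sidecar would be a file on disk
create_index(connection, [
    ("/a.txt", 512, 5, 0o644, 1000, 1000),
    ("/b.txt", 1536, 6, 0o644, 1000, 1000),
], batch_size=1)
print(lookup(connection, "/b.txt"))
```

Because lookups go through the primary key, finding a file's offset does not require loading the whole index into memory.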

Here is a more recent test for version 0.2.0 with the new default SQLite backend:

  • TAR size: 124GB
  • Contains TARs: yes
  • Files in TAR: 1000
  • Files in TAR (including recursively in contained TARs): 1.26 million
  • Index creation (first mounting): 15m 39s
  • Index size: 146MB
  • Index loading (subsequent mounting): 0.000s
  • Reading a 64kB file: ~4ms
  • Running ‘find mountPoint -type f | wc -l’ (1.26M stat calls): 1m 50s

The reading time for a small file simply verifies that random access via file seeking works. The difference between the first read and subsequent reads is not because of ratarmount but because of operating system and file system caches.

Older test with 1.31 TB Imagenet (Fall 2011 release)

The test with the first version of ratarmount (50e8dbb), which used the, as of now removed, pickle backend for serializing the metadata index, for the ImageNet data set:

  • TAR size: 1.31TB
  • Contains TARs: yes
  • Files in TAR: ~26 000
  • Files in TAR (including recursively in contained TARs): 14.2 million
  • Index creation (first mounting): 4 hours
  • Index size: 1GB
  • Index loading (subsequent mounting): 80s
  • Reading a 40kB file: 100ms (first time) and 4ms (subsequent times)

Index loading is relatively slow with 80s because of the pickle backend, which now has been replaced with SQLite and should take less than a second now.

See ratarmount --help or here.

In order to reduce the mounting time, the created index for random access
to files inside the tar will be saved to one of these locations. These
locations are checked in order and the first one that works sufficiently will
be used. This is the default location order:

  1. <path to tar>.index.sqlite
  2. ~/.ratarmount/<path to tar: ‘/’ -> ‘_’>.index.sqlite
    E.g., ~/.ratarmount/_media_cdrom_programm.tar.index.sqlite

This list of fallback folders can be overwritten using the --index-folders
option. Furthermore, an explicitly named index file may be specified using
the --index-file option. If --index-file is used, then the fallback
folders, including the default ones, will be ignored!
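
The naming of the fallback location can be illustrated with a few lines of Python. This sketch only derives the two candidate paths from the example ~/.ratarmount/_media_cdrom_programm.tar.index.sqlite given above; the function name is made up, and the real lookup additionally checks whether a location is usable.

```python
import posixpath


def candidate_index_paths(absolute_tar_path, home):
    """Return the index locations in the default order for an absolute TAR path."""
    beside_archive = absolute_tar_path + ".index.sqlite"
    flattened = absolute_tar_path.replace("/", "_")  # '/' is not allowed in file names
    in_home = posixpath.join(home, ".ratarmount", flattened + ".index.sqlite")
    return [beside_archive, in_home]


for path in candidate_index_paths("/media/cdrom/programm.tar", "/home/user"):
    print(path)
```

The second location matters for cases like the CD-ROM example, where the folder containing the archive is not writable.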

The mount sources can be TARs and/or folders. Because of that, ratarmount
can also be used to bind mount folders read-only to another path similar to
bindfs and mount --bind. So, for:

ratarmount folder mountpoint

all files in folder will now be visible in mountpoint.

If multiple mount sources are specified, the sources on the right side will be
added to or update existing files from a mount source left of it. For example:

ratarmount folder1 folder2 mountpoint

will make both, the files from folder1 and folder2, visible in mountpoint.
If a file exists in multiple sources, then the file from the rightmost
mount source will be used, which in the above example would be folder2.

If you want to update / overwrite a folder with the contents of a given TAR,
you can specify the folder both as a mount source and as the mount point:

ratarmount folder file.tar folder

The FUSE option -o nonempty will be automatically added if such a usage is
detected. If you instead want to update a TAR with a folder, you only have to
swap the two mount sources:

ratarmount file.tar folder folder

If a file exists multiple times in a TAR or in multiple mount sources, then
the hidden versions can be accessed through special .versions folders.
For example, consider:

ratarmount folder updated.tar mountpoint

and the file foo exists both in the folder and as two different versions
in updated.tar. Then, you can list all three versions using:

ls -la mountpoint/foo.versions/
    dr-xr-xr-x 2 user group     0 Apr 25 21:41 .
    dr-x------ 2 user group 10240 Apr 26 15:59 ..
    -r-x------ 2 user group   123 Apr 25 21:41 1
    -r-x------ 2 user group   256 Apr 25 21:53 2
    -r-x------ 2 user group  1024 Apr 25 22:13 3

In this example, the oldest version has only 123 bytes while the newest and
by default shown version has 1024 bytes. So, in order to look at the oldest
version, you can simply do:

cat mountpoint/foo.versions/1

Note that these version numbers are the same as when used with tar’s
--occurrence=N option.
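
The ordering rules for union mounts and file versions can be modeled in a few lines of Python. This is a conceptual sketch, not ratarmount code: each mount source is represented as a list of (path, contents) pairs in the order they appear, and all names are invented for the illustration.

```python
def collect_versions(mount_sources):
    """Group file contents by path in mount order, so that later entries are newer."""
    versions = {}
    for source in mount_sources:       # left to right on the command line
        for path, contents in source:  # in the order they appear in the source
            versions.setdefault(path, []).append(contents)
    return versions


def read_default(versions, path):
    """The rightmost, i.e., newest, version is the one shown by default."""
    return versions[path][-1]


def read_version(versions, path, number):
    """Version numbers start at 1 for the oldest entry, like foo.versions/1."""
    return versions[path][number - 1]


folder = [("foo", b"a" * 123)]
updated_tar = [("foo", b"b" * 256), ("foo", b"c" * 1024)]
versions = collect_versions([folder, updated_tar])
print(len(versions["foo"]), len(read_version(versions, "foo", 1)), len(read_default(versions, "foo")))
```

With the sizes from the listing above, version 1 is the 123-byte file from the folder and the default view is the 1024-byte file that appears last in the TAR.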

Use ratarmount -o modules=subdir,subdir=<prefix> to remove path prefixes using the FUSE subdir module. Because it is a standard FUSE feature, the -o ... argument should also work for other FUSE applications.

When mounting an archive created with absolute paths, e.g., tar -P cf /var/log/apt/history.log, you would see the whole var/log/apt hierarchy under the mount point. To avoid that, specified prefixes can be stripped from paths so that the mount target directory directly contains history.log. Use ratarmount -o modules=subdir,subdir=/var/log/apt/ to do so. The specified path to the folder inside the TAR will be mounted to root, i.e., the mount point.

If you want to mount a compressed file not containing a TAR, e.g., foo.bz2, then you can also use ratarmount for that. The uncompressed view will then be mounted as foo inside the mount point and you will be able to leverage ratarmount’s seeking capabilities when opening that file.

In contrast to bzip2 and gzip compressed files, true seeking on xz and zst files is only possible at block or frame boundaries. This wouldn’t be noteworthy, if both standard compressors for xz and zstd were not by default creating unsuited files. Even though both file formats do support multiple frames and xz even contains a frame table at the end for easy seeking, both compressors write only a single frame and/or block out, making this feature unusable. In order to generate truly seekable compressed files, you’ll have to use pixz for xz files. For zstd compressed files, you can try t2sz. The standard zstd tool does not support setting smaller block sizes yet although an issue does exist. Alternatively, you can simply split the original file into parts, compress those parts, and then concatenate those parts together to get a suitable multiframe zst file. Here is a bash function, which can be used for that:

Bash script: createMultiFrameZstd
createMultiFrameZstd()
(
    # Detect being piped into
    if [ -t 0 ]; then
        file=$1
        frameSize=$2
        if [[ ! -f "$file" ]]; then echo "Could not find file '$file'." 1>&2; return 1; fi
        fileSize=$( stat -c %s -- "$file" )
    else
        if [ -t 1 ]; then echo 'You should pipe the output to somewhere!' 1>&2; return 1; fi
        echo 'Will compress from stdin...' 1>&2
        frameSize=$1
    fi
    if [[ ! $frameSize =~ ^[0-9]+$ ]]; then
        echo "Frame size '$frameSize' is not a valid number." 1>&2
        return 1
    fi

    # Create a temporary file. I avoid simply piping to zstd
    # because it wouldn't store the uncompressed size.
    if [[ -d /dev/shm ]]; then frameFile=$( mktemp --tmpdir=/dev/shm ); fi
    if [[ -z $frameFile ]]; then frameFile=$( mktemp ); fi
    if [[ -z $frameFile ]]; then
        echo "Could not create a temporary file for the frames." 1>&2
        return 1
    fi

    if [ -t 0 ]; then
        true > "$file.zst"
        for (( offset = 0; offset < fileSize; offset += frameSize )); do
            dd if="$file" of="$frameFile" bs=$(( 1024*1024 )) \
               iflag=skip_bytes,count_bytes skip="$offset" count="$frameSize" 2>/dev/null
            zstd -c -q -- "$frameFile" >> "$file.zst"
        done
    else
        while true; do
            dd of="$frameFile" bs=$(( 1024*1024 )) \
               iflag=count_bytes count="$frameSize" 2>/dev/null
            # pipe is finished when reading it yields no further data
            if [[ ! -s "$frameFile" ]]; then break; fi
            zstd -c -q -- "$frameFile"
        done
    fi

    'rm' -f -- "$frameFile"
)

In order to compress a file named foo into a multiframe zst file called foo.zst, which contains frames sized 4MiB of uncompressed data, you would call it like this:

createMultiFrameZstd foo  $(( 4*1024*1024 ))

It also works when being piped to. This can be useful for recompressing files to avoid having to decompress them first to disk.

| createMultiFrameZstd $(( 4*1024*1024 )) > recompressed.zst
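
Why independently compressed frames make a file seekable can be shown in Python. Because the standard library only gained zstd support very recently, this sketch swaps in xz streams via the lzma module as a stand-in; the principle of concatenating independent frames and remembering where each one starts is the same. The seek table layout and function names are assumptions made for this example.

```python
import lzma


def compress_in_frames(data, frame_size):
    """Compress data as independent xz streams and record where each one starts."""
    compressed = bytearray()
    seek_table = []  # (uncompressed offset, compressed offset, compressed length)
    for offset in range(0, len(data), frame_size):
        frame = lzma.compress(data[offset:offset + frame_size])
        seek_table.append((offset, len(compressed), len(frame)))
        compressed += frame
    return bytes(compressed), seek_table


def read_at(compressed, seek_table, frame_size, position, length):
    """Read a byte range by decompressing only the frames that overlap it."""
    result = bytearray()
    first_frame = position // frame_size
    for uncompressed_offset, compressed_offset, compressed_length in seek_table[first_frame:]:
        if uncompressed_offset >= position + length:
            break
        frame = lzma.decompress(compressed[compressed_offset:compressed_offset + compressed_length])
        start = max(position - uncompressed_offset, 0)
        end = min(position + length - uncompressed_offset, len(frame))
        result += frame[start:end]
    return bytes(result)


data = bytes(range(256)) * 64  # 16 KiB of sample data
compressed, seek_table = compress_in_frames(data, 4096)
print(len(seek_table), read_at(compressed, seek_table, 4096, 5000, 6000) == data[5000:11000])
```

The concatenated result is still a valid file for a normal decompressor, while a reader with the seek table only has to decode two of the four frames for the requested range. Smaller frames mean faster seeking at the cost of a worse compression ratio.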

The fsspec API backend adds support for mounting many remote archives or folders.
Please refer to the linked respective backend documentation to see the full configuration options, especially for specifying credentials.
Some often-used configuration environment variables are copied here for easier viewing.

Symbol Description
[something] Optional “something”
(one|two) Either “one” or “two”

  • git://[path-to-repo:][ref@]path/to/file
    Uses the current path if no repository path is specified.
    Backend: ratarmountcore via pygit2

  • github://org:repo@[sha]/path-to/file-or-folder
    Example: github://mxmlnkn:ratarmount@v0.15.2/tests/one-file.tar
    Backend: fsspec

  • http[s]://hostname[:port]/path-to/archive.rar
    Backend: fsspec via aiohttp

  • (ipfs|ipns)://content-identifier
    Example: ipfs daemon & sleep 2 && ratarmount -f ipfs://QmYwAPJzv5CZsnA625s3Xf2nemtYgPpHdWEz79ojWnPbdG mounted
    Backend: fsspec/ipfsspec
    Tries to connect to a running local ipfs daemon instance by default, which needs to be started beforehand.
    Alternatively, a (public) gateway can be specified with the environment variable IPFS_GATEWAY, e.g., https://127.0.0.1:8080.
    Specifying a public gateway does not (yet) work because of this issue.

  • s3://[endpoint-hostname[:port]]/bucket[/single-file.tar[?versionId=some_version_id]]
    Backend: fsspec/s3fs via boto3
    The URL will default to AWS according to the Boto3 library defaults when no endpoint is specified.
    Boto3 will check, among others, these environment variables, for credentials:

    • AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_DEFAULT_REGION

    fsspec/s3fs furthermore supports this environment variable:

  • ftp://[user[:password]@]hostname[:port]/path-to/archive.rar
    Backend: fsspec via ftplib

  • (ssh|sftp)://[user[:password]@]hostname[:port]/path-to/archive.rar
    Backend: fsspec/sshfs via asyncssh
    The usual configuration via ~/.ssh/config is supported.

  • smb://[workgroup;][user:password@]server[:port]/share/folder/file.tar

  • webdav://[user:password@]host[:port][/path]
    Backend: webdav4 via httpx
    Environment variables: WEBDAV_USER, WEBDAV_PASSWORD

  • dropbox://path
    Backend: fsspec/dropboxdrivefs via dropbox-sdk-python
    Follow these instructions to create an app. Check the files.metadata.read and files.content.read permissions and press “submit” and after that create the (long) OAuth 2 token and store it in the environment variable DROPBOX_TOKEN. Ignore the (short) app key and secret. This creates a corresponding app folder that can be filled with data.

Many other fsspec-based projects may also work when installed.

This functionality of ratarmount offers a hopefully more-tested and out-of-the-box experience over the experimental fsspec.fuse implementation.
And, it also works in conjunction with the other features of ratarmount such as union mounting and recursive mounting.

Index files specified with --index-file can also be compressed and/or be an fsspec (chained) URL, e.g., https://host.org/file.tar.index.sqlite.gz.
In such a case, the index file will be downloaded and/or extracted into the default temporary folder.
If the default temporary folder has insufficient disk space, it can be changed by setting the RATARMOUNT_INDEX_TMPDIR environment variable.

The --write-overlay option can be used to create a writable mount point.
The original archive will not be modified.

  • File creations will create these files in the specified overlay folder.
  • File deletions and renames will be registered in a database that also resides in the overlay folder.
  • File modifications will copy the file from the archive into the overlay folder before applying the modification.

This overlay folder can be stored alongside the archive or it can be deleted after unmounting the archive.
This is useful when building the executable from a source tarball without extracting.
After installation, the intermediary build files residing in the overlay folder can be safely removed.

If it is desired to apply the modifications to the original archive, then the --commit-overlay can be prepended to the original ratarmount call.
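
The overlay semantics described above can be summarized in a small in-memory model. This is a conceptual sketch with invented names; the real overlay is a folder on disk plus a database of deletions, and committing is done by ratarmount itself.

```python
class OverlayView:
    """Read-only archive contents plus a writable overlay that tracks deletions."""

    def __init__(self, archive_files):
        self.archive_files = dict(archive_files)  # never modified by writes
        self.overlay_files = {}                   # created or modified files
        self.deleted = set()                      # archive paths hidden by deletions

    def write(self, path, data):
        self.overlay_files[path] = data
        self.deleted.discard(path)

    def delete(self, path):
        self.read(path)  # raises FileNotFoundError if the path is not visible
        self.overlay_files.pop(path, None)
        if path in self.archive_files:
            self.deleted.add(path)

    def read(self, path):
        if path in self.overlay_files:
            return self.overlay_files[path]
        if path in self.archive_files and path not in self.deleted:
            return self.archive_files[path]
        raise FileNotFoundError(path)

    def commit(self):
        """Return the archive contents after applying deletions and overlay files."""
        result = {p: d for p, d in self.archive_files.items() if p not in self.deleted}
        result.update(self.overlay_files)
        return result


view = OverlayView({"example.txt": b"old contents"})
view.write("new-file.txt", b"Hello World\n")
view.delete("example.txt")
print(sorted(view.commit()))
```

Until commit is called, the archive dictionary stays untouched, which mirrors the guarantee that the original archive is not modified while mounted with a write overlay.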

Here is an example for applying modifications to a writable mount and then committing those modifications back to the archive:

  1. Mount it with a write overlay and add new files. The original archive is not modified.

    ratarmount --write-overlay example-overlay example.tar example-mount-point
    echo "Hello World" > example-mount-point/new-file.txt
  2. Unmount. Changes persist solely in the overlay folder.

    fusermount -u example-mount-point
  3. Commit changes to the original archive.

    ratarmount --commit-overlay --write-overlay example-overlay example.tar example-mount-point

    Output:

    '/tmp/tmp_ajfo8wf/deletions.lst' --file 'example.tar' 2>&1 |
       sed '/^tar: Exiting with failure/d; /^tar.*Not found in archive/d'
    tar --append -C 'zlib-wiki-overlay' --null --verbatim-files-from --files-from='/tmp/tmp_ajfo8wf/append.lst' --file 'example.tar'

    Committing is an experimental feature!
    Please confirm by entering "commit". Any other input will cancel.
    >
    Committed successfully. You can now remove the overlay folder at example-overlay.
  4. Verify the modifications to the original archive.

    Output:

    -rw-rw-r-- user/user 652817 2022-08-08 10:44 example.txt
    -rw-rw-r-- user/user     12 2023-02-16 09:49 new-file.txt

  5. Remove the obsolete write overlay folder.

Ratarmount can also be used as a library.
Using ratarmountcore, files inside archives can be accessed directly from Python code without requiring FUSE.
For a more detailed description, see the ratarmountcore readme here.

To use all fsspec features, install via pip install ratarmount[fsspec].
It should also suffice to simply pip install fsspec if ratarmountcore is already installed.
The optional fsspec integration is threefold:

  1. Files can be specified on the command line via URLs pointing to remotes as explained in the section “Remote Files”.
  2. A ratarmountcore.MountSource wrapping fsspec AbstractFileSystem implementation has been added.
    A specialized SQLiteIndexedTarFileSystem as a more performant and direct replacement for fsspec.implementations.TarFileSystem has also been added.

    from ratarmountcore.SQLiteIndexedTarFsspec import SQLiteIndexedTarFileSystem as ratarfs
    fs = ratarfs("tests/one-file.tar")
    print("Files in root:", fs.ls("/", detail=False))
    print("Contents of /bar:", fs.cat("/bar"))
  3. During installation ratarmountcore registers the ratar:// protocol with fsspec via an entrypoint group.
    This enables usages with fsspec.open.
    The fsspec URL chaining feature must be used in order for this to be useful.
    Example for opening the file bar, which is contained inside the file tests/one-file.tar.gz with ratarmountcore:

    import fsspec
    with fsspec.open("ratar://bar::file://tests/one-file.tar.gz") as file:
        print("Contents of file bar:", file.read())

    This also works with pandas:

    import fsspec
    import pandas as pd
    with fsspec.open("ratar://bar::file://tests/one-file.tar.gz", compression=None) as file:
        print("Contents of file bar:", file.read())

    The compression=None argument is currently necessary because of this Pandas bug.

Files with sequentially numbered extensions can be mounted as a joined file.
If it is an archive, then the joined archive file will be mounted.
Only one of the files, preferably the first one, should be specified.
For example:

| head -c $(( 1024 * 1024 )) > 1MiB.dat
tar -cjf- 1MiB.dat | split -d --bytes=320K - file.tar.gz.
ls -la
# 320K  file.tar.gz.00
# 320K  file.tar.gz.01
# 138K  file.tar.gz.02
ratarmount file.tar.gz.00 mounted
ls -la mounted
# 1.0M  1MiB.dat
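
How parts with sequentially numbered extensions can be discovered and read as one file is sketched below in Python. The detection rule, a trailing dot followed by digits and counting upwards from the given part, is an assumption for this example; ratarmount's actual rules may differ. All function names are hypothetical.

```python
import os
import re
import tempfile


def find_parts(first_part):
    """Return the given part and all following parts with consecutive numbers."""
    match = re.fullmatch(r"(.*\.)(\d+)", first_part)
    if match is None:
        return [first_part]
    prefix, digits = match.groups()
    parts = []
    number = int(digits)
    while True:
        candidate = f"{prefix}{number:0{len(digits)}d}"
        if not os.path.isfile(candidate):
            break
        parts.append(candidate)
        number += 1
    return parts


def read_joined(parts):
    """Concatenate the contents of all parts in order."""
    chunks = []
    for part in parts:
        with open(part, "rb") as file:
            chunks.append(file.read())
    return b"".join(chunks)


def demo():
    """Write three small parts and read them back as one joined file."""
    with tempfile.TemporaryDirectory() as folder:
        for number, data in enumerate([b"abc", b"def", b"gh"]):
            with open(os.path.join(folder, f"file.tar.gz.{number:02d}"), "wb") as file:
                file.write(data)
        parts = find_parts(os.path.join(folder, "file.tar.gz.00"))
        return [os.path.basename(part) for part in parts], read_joined(parts)
```

A real implementation would not load all parts into memory but translate an offset in the joined view into a part and an offset inside that part.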
