Size_t : Make large files work #2166

Closed
wants to merge 1,281 commits into from

Conversation

PhilipOakley

SUCCESS. It's still rough, but it's working.

Just wanted to get it out there.

This is on top of shears/pu and t-b's big size_t patch.

Signed-off-by: Philip Oakley [email protected]

phili@Philip-Win10 MINGW64 /usr/src/git/t (size_t5)
$ ./t-large-files-on-windows.sh -i -v -x -d
Initialized empty Git repository in C:/git-sdk-64/usr/src/git/t/trash directory.t-large-files-on-windows/.git/
checking prerequisite: SIZE_T_IS_64BIT

mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" &&
(
        cd "$TRASH_DIRECTORY/prereq-test-dir" &&
        test 8 -le "$(build_option sizeof-size_t)"

)
++ mkdir -p '/usr/src/git/t/trash directory.t-large-files-on-windows/prereq-test-dir'
++ cd '/usr/src/git/t/trash directory.t-large-files-on-windows/prereq-test-dir'
+++ build_option sizeof-size_t
+++ git version --build-options
+++ sed -ne 's/^sizeof-size_t: //p'
++ test 8 -le 8
prerequisite SIZE_T_IS_64BIT ok
expecting success:

        test-tool zlib-compile-flags >zlibFlags.txt &&
        dd if=/dev/zero of=file bs=1M count=4100 &&
        git config core.compression 0 &&
        git config core.looseCompression 0 &&
        git add file &&
        git verify-pack -s .git/objects/pack/*.pack &&
        git fsck --verbose --strict --full &&
        git commit -m msg file &&
        git log --stat &&
        git gc &&
        git fsck --verbose --strict --full &&
        git index-pack -v -o test.idx .git/objects/pack/*.pack &&
        git gc &&
        git fsck

++ test-tool zlib-compile-flags
++ dd if=/dev/zero of=file bs=1M count=4100
4100+0 records in
4100+0 records out
4299161600 bytes (4.3 GB, 4.0 GiB) copied, 6.89465 s, 624 MB/s
++ git config core.compression 0
++ git config core.looseCompression 0
++ git add file
++ git verify-pack -s .git/objects/pack/pack-88fec382e989b72b87fdd0ecc88bb70529b3dd0b.pack
non delta: 1 object
++ git fsck --verbose --strict --full
Checking object directory
Checking blob 754a93d6fada4c6873360e6cb4b209132271ab0e
Checking HEAD link
notice: HEAD points to an unborn branch (master)
notice: No default references
Checking connectivity (32 objects)
Checking 754a93d6fada4c6873360e6cb4b209132271ab0e
++ git commit -m msg file
[master (root-commit) 1114911] msg
 Author: A U Thor <[email protected]>
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 file
++ git log --stat
commit 11149116b3df24b582e545deb18b16fe1492a65d (HEAD -> master)
Author: A U Thor <[email protected]>
Date:   Mon Apr 22 22:39:33 2019 +0000

    msg

 file | Bin 0 -> 4194304 bytes
 1 file changed, 0 insertions(+), 0 deletions(-)
++ git gc
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 1 (delta 0)
++ git fsck --verbose --strict --full
Checking object directory
Checking commit 11149116b3df24b582e545deb18b16fe1492a65d
Checking blob 754a93d6fada4c6873360e6cb4b209132271ab0e
Checking tree 1a9c7e21bb658a2600dc0c191b7efb654106f754
Checking HEAD link
Checking reflog 0000000000000000000000000000000000000000->11149116b3df24b582e545deb18b16fe1492a65d
Checking reflog 0000000000000000000000000000000000000000->11149116b3df24b582e545deb18b16fe1492a65d
Checking cache tree
Checking connectivity (32 objects)
Checking 11149116b3df24b582e545deb18b16fe1492a65d
Checking 754a93d6fada4c6873360e6cb4b209132271ab0e
Checking 1a9c7e21bb658a2600dc0c191b7efb654106f754
++ git index-pack -v -o test.idx .git/objects/pack/pack-6c993b46fa70cbd847fa02eb7b618f6cbb576b2d.pack
Indexing objects: 100% (3/3), done.
6c993b46fa70cbd847fa02eb7b618f6cbb576b2d
++ git gc
Enumerating objects: 3, done.
Counting objects: 100% (3/3), done.
Writing objects: 100% (3/3), done.
Total 3 (delta 0), reused 3 (delta 0)
++ git fsck
Checking object directories: 100% (256/256), done.
Checking objects: 100% (3/3), done.
ok 1 - require 64bit size_t

# passed all 1 test(s)
1..1

phili@Philip-Win10 MINGW64 /usr/src/git/t (size_t5)
$
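The size arithmetic behind this test can be sketched in C. The file is 4100 MiB, slightly more than a 32-bit unsigned value can hold. The helper names below are illustrative only and are not taken from Git. Truncating that size to 32 bits yields 4194304, which matches the byte count printed by `git log --stat` in the trace above, so that figure may point to a remaining 32-bit path in the diffstat output rather than the real file size.

```c
#include <assert.h>
#include <stdint.h>

/* Size of the test file: 4100 blocks of 1 MiB, as in the dd command. */
uint64_t test_file_size(void)
{
	return (uint64_t)4100 * 1024 * 1024;
}

/* What survives when a size passes through a 32-bit unsigned type. */
uint32_t truncate_to_32bit(uint64_t size)
{
	return (uint32_t)size; /* keeps only the low 32 bits */
}

/* Self-check of the two figures discussed above. */
void check_sizes(void)
{
	assert(test_file_size() == 4299161600ULL);
	assert(truncate_to_32bit(test_file_size()) == 4194304U);
}
```

Calling `check_sizes()` confirms that 4299161600 bytes wrap to 4194304 once only the low 32 bits are kept.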

dscho and others added 30 commits April 19, 2019 08:31
Between the libgit2 and the Git for Windows project, there has been a
discussion how we could share Git configuration to avoid duplication (or
worse: skew).

Earlier, libgit2 was nice enough to just re-use Git for Windows'

	C:\Program Files (x86)\Git\etc\gitconfig

but with the upcoming Git for Windows 2.x, there would be more paths to
search, as we will have 64-bit and 32-bit versions, and the
corresponding config files will be in %PROGRAMFILES%\Git\mingw64\etc and
...\mingw32\etc, respectively.

Worse: there are portable Git for Windows versions out there which live
in totally unrelated directories, still.

Therefore we came to a consensus to use `%PROGRAMDATA%\Git\config` as the
location for shared Git settings that are of wider interest than just Git
for Windows.

Of course, the configuration in `%PROGRAMDATA%\Git\config` has the
widest reach, therefore it must take the lowest precedence, i.e. Git for
Windows can still override settings in its `etc/gitconfig` file.

Helped-by: Andreas Heiduk <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
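The precedence rule described in this commit message can be illustrated with a minimal C sketch. It assumes configuration sources are applied in order from lowest to highest precedence, so the last match wins. The `setting` and `source` structs and the `effective_value` function are hypothetical and do not reflect Git's actual config machinery.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair and config source, for illustration only. */
struct setting {
	const char *key;
	const char *value;
};

struct source {
	const char *name;
	const struct setting *settings;
	size_t count;
};

/*
 * Sources are ordered lowest precedence first; a later source overrides
 * an earlier one. Returns NULL if no source defines the key.
 */
const char *effective_value(const struct source *sources, size_t nsources,
			    const char *key)
{
	const char *result = NULL;
	size_t i, j;

	assert(key != NULL);
	for (i = 0; i < nsources; i++)
		for (j = 0; j < sources[i].count; j++)
			if (strcmp(sources[i].settings[j].key, key) == 0)
				result = sources[i].settings[j].value;
	return result;
}
```

Listing the `%PROGRAMDATA%\Git\config` source first and `etc/gitconfig` second would let the latter override any key both define.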
For many Win32 functions, there actually exist two variants: one with
the `A` suffix that takes ANSI parameters (`char *` or `const char *`)
and one with the `W` suffix that takes Unicode parameters (`wchar_t *`
or `const wchar_t *`).

Let's be precise what we want to use.

Signed-off-by: Johannes Schindelin <[email protected]>
Fix t0001 when the current working directory differs in case from the canonical form
…fallback

[Outreachy] Removed ipv6 fallback
Assumes file names in git tree objects are UTF-8 encoded.

On most unix systems, the system encoding (and thus the TCL system
encoding) will be UTF-8, so file names will be displayed correctly.

On Windows, it is impossible to set the system encoding to UTF-8.
Changing the TCL system encoding (via 'encoding system ...', e.g. in the
startup code) is explicitly discouraged by the TCL docs.

Change gitk functions dealing with file names to always convert
from and to UTF-8.

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
This reverts commit a9fa11f.

Signed-off-by: Heiko Voigt <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
We will use them in the upcoming "FSCache" patches (to accelerate
sequential lstat() calls).

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
Add a helper function to start GDB that was already attached to the current process
On Windows, there are dramatic problems when a command line grows
beyond PATH_MAX, which is restricted to 8191 characters on XP and
later (according to http://support.microsoft.com/kb/830473).

Work around this by just cutting off the command line at that length
(actually, at a space boundary) in the hope that only negative
refs are chucked: gitk will then do unnecessary work, but that is
still better than flashing the gitk window and exiting with exit
status 5 (which no Windows user is able to make sense of).

The first fix caused Tcl to fail to compile the regexp, see msysGit issue
427. Here is another fix without using regexp, and using a more relaxed
command line length limit to fix the original issue 387.

Signed-off-by: Sebastian Schuberth <[email protected]>
Signed-off-by: Pat Thoyts <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
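The truncation idea in this commit message (cut the command line at a space boundary, without a regexp) can be sketched as follows. The actual change is in gitk's Tcl code; this C version is only an illustration, and `truncate_at_space` is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Truncate `cmd` in place so that it is at most `limit` characters long,
 * cutting at the last space at or before `limit` so no argument is split.
 * Falls back to a hard cut if no space is found. Returns the new length.
 */
size_t truncate_at_space(char *cmd, size_t limit)
{
	size_t len, cut;

	assert(cmd != NULL);
	len = strlen(cmd);
	if (len <= limit)
		return len; /* already short enough */

	cut = limit;
	while (cut > 0 && cmd[cut] != ' ')
		cut--;
	if (cut == 0)
		cut = limit; /* no usable space boundary: hard cut */

	cmd[cut] = '\0';
	return cut;
}
```

With a limit of 12, the string `gitk --all ^refA ^refB` is cut back to `gitk --all`, dropping the trailing refs whole rather than splitting one.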
Make use of the new environment variable GIT_ASK_YESNO to support the
recently implemented fallback in case unlink, rename or rmdir fail for
files in use on Windows. The added dialog will present a yes/no question
to the user, which the Windows compat layer currently uses
to let the user retry a failed file operation.

Signed-off-by: Heiko Voigt <[email protected]>
Git for Windows 2.x ships with an executable that starts the Git Bash
with all the environment variables and what not properly set up. It is
also adjusted according to the Terminal emulator option chosen when
installing Git for Windows (while `bash.exe --login -i` would always
launch with Windows' default console).

So let's use that executable (usually C:\Program Files\Git\git-bash.exe)
instead of `bash.exe --login -i` if its presence was detected.

This fixes git-for-windows#490

Signed-off-by: Thomas Kläger <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
Move opendir down in preparation for the next patch.

Signed-off-by: Karsten Blees <[email protected]>
This topic branch avoids spawning `gzip` when asking `git archive` to
create `.tar.gz` files.

Signed-off-by: Johannes Schindelin <[email protected]>
Signed-off-by: Chris West (Faux) <[email protected]>
Signed-off-by: Sebastian Schuberth <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
The text wrapping seems to be aligned to the right side of the Yes
button, leaving an awful lot of empty space.

Let's try to counter this by using pixel units.

Signed-off-by: Johannes Schindelin <[email protected]>
Since v2.9.0, Git knows about the config variable core.hookspath
that allows overriding the path to the directory containing the
Git hooks.

Since v2.10.0, the `--git-path` option respects that config
variable, too, so we may just as well use that command.

For Git versions older than v2.5.0 (which was the first version to
support the `--git-path` option for the `rev-parse` command), we
simply fall back to the previous code.

This fixes git-for-windows#1755

Initial-patch-by: Philipp Gortan <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
This topic branch addresses the bug where Git for Windows 2.x' Git GUI
failed to generate a working shortcut via Repository>Create Desktop
Shortcut.

Signed-off-by: Johannes Schindelin <[email protected]>
Emulating the POSIX dirent API on Windows via FindFirstFile/FindNextFile is
pretty straightforward; however, most of the information provided in the
WIN32_FIND_DATA structure is thrown away in the process. A more
sophisticated implementation may cache this data, e.g. for later reuse in
calls to lstat.

Make the dirent implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Define a base DIR structure with pointers to readdir/closedir that match
the opendir implementation (i.e. similar to vtable pointers in OOP).
Define readdir/closedir so that they call the function pointers in the DIR
structure. This allows choosing the opendir implementation on a
call-by-call basis.

Move the fixed sized dirent.d_name buffer to the dirent-specific DIR
structure, as d_name may be implementation specific (e.g. a caching
implementation may just set d_name to point into the cache instead of
copying the entire file name string).

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
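The vtable-style layout described in this commit message might look roughly like the sketch below. All names (`my_DIR`, `my_dirent`, `array_opendir`, and so on) are hypothetical stand-ins chosen to avoid clashing with the system dirent API, and the "directory" is just an in-memory array of names rather than a Windows directory handle.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct my_dirent {
	const char *d_name; /* a pointer, so an implementation may point into its own storage */
};

/* Base structure: only the function pointers every implementation must set. */
typedef struct my_DIR {
	struct my_dirent *(*preaddir)(struct my_DIR *dir);
	int (*pclosedir)(struct my_DIR *dir);
} my_DIR;

/* Generic entry points dispatch through the pointers stored by opendir. */
struct my_dirent *my_readdir(my_DIR *dir)
{
	assert(dir != NULL && dir->preaddir != NULL);
	return dir->preaddir(dir);
}

int my_closedir(my_DIR *dir)
{
	assert(dir != NULL && dir->pclosedir != NULL);
	return dir->pclosedir(dir);
}

/* One concrete implementation: iterates over a caller-supplied array. */
struct array_DIR {
	my_DIR base; /* must be first so the casts below are valid */
	struct my_dirent ent;
	const char **names;
	size_t count, pos;
};

static struct my_dirent *array_readdir(my_DIR *dir)
{
	struct array_DIR *d = (struct array_DIR *)dir;

	if (d->pos >= d->count)
		return NULL; /* end of directory */
	d->ent.d_name = d->names[d->pos++];
	return &d->ent;
}

static int array_closedir(my_DIR *dir)
{
	free(dir);
	return 0;
}

my_DIR *array_opendir(const char **names, size_t count)
{
	struct array_DIR *d = malloc(sizeof(*d));

	if (!d)
		return NULL;
	d->base.preaddir = array_readdir;
	d->base.pclosedir = array_closedir;
	d->ent.d_name = NULL;
	d->names = names;
	d->count = count;
	d->pos = 0;
	return &d->base;
}
```

Because each opendir variant fills in its own function pointers, a caller can mix implementations call by call while still using the same `my_readdir`/`my_closedir` entry points.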
Git for Windows now ships with the new Git icon from git-scm.com. Use that
icon file if it exists instead of the old procedurally drawn one.

This patch was sent upstream but so far no decision on its inclusion was
made, so commit it to our fork.

Signed-off-by: Sebastian Schuberth <[email protected]>
"Question?" is maybe not the most informative thing to ask. In the
absence of better information, it is the best we can do, of course.

However, Git for Windows' auto updater just learned the trick to use
git-gui--askyesno to ask the user whether to update now or not. And in
this scripted scenario, we can easily pass a command-line option to
change the window title.

So let's support that with the new `--title <title>` option.

Signed-off-by: Johannes Schindelin <[email protected]>
git-gui tries to temporarily set GIT_DIR for starting gitk and to restore
it after gitk has started. But if GIT_DIR was not set prior to the
invocation, it is not unset afterwards. This affects commands
that are later started from that git gui, for example "Git Bash".

Fix it.

Signed-off-by: Max Kirillov <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
Let's try to address git-for-windows#1755 this way.

Signed-off-by: Johannes Schindelin <[email protected]>
Emulating the POSIX lstat API on Windows via GetFileAttributes[Ex] is quite
slow. Windows operating system APIs seem to be much better at scanning the
status of entire directories than checking single files. A caching
implementation may improve performance by bulk-reading entire directories
or reusing data obtained via opendir / readdir.

Make the lstat implementation pluggable so that it can be switched at
runtime, e.g. based on a config option.

Signed-off-by: Karsten Blees <[email protected]>
Signed-off-by: Johannes Schindelin <[email protected]>
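A runtime-switchable lstat, as described in this commit message, could be sketched with a single function pointer. This is a simplification under stated assumptions: `mock_stat`, `slow_lstat`, `cached_lstat`, `active_lstat` and `use_lstat_cache` are hypothetical names, the "metadata" is faked, and the cache holds one entry rather than a bulk-read directory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct mock_stat {
	long size;
};

typedef int (*lstat_fn)(const char *path, struct mock_stat *st);

int slow_calls; /* counts how often the slow path actually ran */

/* Stand-in for the per-file lookup; fakes the size from the path length. */
int slow_lstat(const char *path, struct mock_stat *st)
{
	assert(path != NULL && st != NULL);
	slow_calls++;
	st->size = (long)strlen(path);
	return 0;
}

/* Caching variant: remembers the most recent path and its result. */
int cached_lstat(const char *path, struct mock_stat *st)
{
	static char last_path[256];
	static struct mock_stat last_st;
	static int valid;

	assert(path != NULL && st != NULL);
	if (strlen(path) >= sizeof(last_path))
		return slow_lstat(path, st); /* too long to cache: bypass */

	if (!valid || strcmp(path, last_path) != 0) {
		if (slow_lstat(path, &last_st) != 0)
			return -1;
		strcpy(last_path, path);
		valid = 1;
	}
	*st = last_st;
	return 0;
}

/* Callers go through this pointer; it can be switched at runtime. */
lstat_fn active_lstat = slow_lstat;

void use_lstat_cache(int enabled)
{
	active_lstat = enabled ? cached_lstat : slow_lstat;
}
```

A config option would then only need to call `use_lstat_cache(1)` once; every later `active_lstat(path, &st)` call picks up the chosen implementation.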
Signed-off-by: Johannes Schindelin <[email protected]>
Tcl/Tk 8.6 introduced new events for the cursor left/right keys and
apparently changed the behavior of the previous event.

Let's work around that by using the new events when we are running with
Tcl/Tk 8.6 or later.

This fixes git-for-windows#495

Signed-off-by: Johannes Schindelin <[email protected]>
For additional GUI goodness.

Signed-off-by: Johannes Schindelin <[email protected]>
git-gui: correctly restore GIT_DIR after invoking commands
Philip Oakley added 25 commits April 21, 2019 16:13
On Windows, long may be only 32 bits, and its use in pointer-sized
comparisons is potentially implementation-defined.

Ensure such values are up-cast to size_t.

Signed-off-by: Philip Oakley <[email protected]>
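The effect of up-casting before the arithmetic can be shown with a small sketch. It uses a fixed 32-bit typedef (`win_ulong`, a hypothetical name) to stand in for Windows' `unsigned long`, so the behaviour is the same on any platform, and it assumes a 64-bit `size_t`, mirroring the SIZE_T_IS_64BIT prerequisite in the test above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 32-bit unsigned long, as found on 64-bit Windows. */
typedef uint32_t win_ulong;

/* Problematic: the product is computed in 32 bits and wraps before widening. */
size_t total_bytes_narrow(win_ulong count, win_ulong each)
{
	return count * each;
}

/* Fixed: both operands are widened first, so the product is a full size_t. */
size_t total_bytes_wide(win_ulong count, win_ulong each)
{
	assert(sizeof(size_t) >= 8); /* this sketch assumes a 64-bit size_t */
	return (size_t)count * (size_t)each;
}
```

For 4100 blocks of 1 MiB, the narrow version returns 4194304 while the widened version returns 4299161600.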
On Windows, computation may be performed as 32bit long rather than size_t.
Ensure all arguments are size_t to avoid implementation-dependent implicit
conversion.

Signed-off-by: Philip Oakley <[email protected]>
The 'size' is 32 bits on Windows, but 64 bits on Linux and needs careful
selection a-priori of NO_DEFLATE_BOUND

Signed-off-by: Philip Oakley <[email protected]>
On Windows, uLong and size_t are different, being 32bit and 64bit
respectively. Computations of mixed 32/64 bit types can be
implementation-defined, leading to potential accuracy loss and error.

Avoid wraparound of z.total_in and z.total_out by always
starting at zero. The chunk size is kept well within 32bit limits.

Ensure that z.total_in and z.total_out are _upcast_ when computing the
overall avail_in and avail_out values.

Signed-off-by: Philip Oakley <[email protected]>
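The chunking approach described in this commit message can be sketched without linking against zlib. The `mock_stream` struct below is a hypothetical stand-in for the 32-bit counters of a `z_stream` on Windows, `mock_consume` pretends to be the codec, and `CHUNK_LIMIT` is an assumed value rather than the one used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for z_stream's counters, 32 bits wide as on Windows. */
struct mock_stream {
	uint32_t avail_in;
	uint32_t total_in;
};

/* Assumed chunk size: 1 GiB, well within 32-bit limits. */
#define CHUNK_LIMIT ((size_t)1024 * 1024 * 1024)

/* Pretend codec: consumes everything that was offered. */
static void mock_consume(struct mock_stream *s)
{
	s->total_in += s->avail_in;
	s->avail_in = 0;
}

/*
 * Feed `len` bytes in chunks. The 32-bit counter restarts at zero for
 * each chunk, and the running total is kept in a size_t.
 */
size_t feed_in_chunks(struct mock_stream *s, size_t len)
{
	size_t done = 0;

	while (done < len) {
		size_t chunk = len - done;

		if (chunk > CHUNK_LIMIT)
			chunk = CHUNK_LIMIT;

		s->total_in = 0;               /* always start at zero */
		s->avail_in = (uint32_t)chunk; /* safe: chunk <= CHUNK_LIMIT */
		mock_consume(s);

		assert(s->total_in <= CHUNK_LIMIT);
		done += (size_t)s->total_in;   /* upcast before accumulating */
	}
	return done;
}
```

Because the 32-bit counter never accumulates across chunks, it cannot wrap, and the overall progress lives only in the `size_t` variable `done`.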
Tell the developer which condition failed.

Signed-off-by: Philip Oakley <[email protected]>
On Windows, uInt/uLong are only 32 bits.

Signed-off-by: Philip Oakley <[email protected]>
Zero length initialisations are not converted.

Signed-off-by: Philip Oakley <[email protected]>
(also updated packfile.h)

Signed-off-by: Philip Oakley <[email protected]>
verify the pack at earliest opportunity

Add the extra fsck to get diagnostics after the add.

it's -v (verbose) not --verify

Slight confusion as to why index-pack vs verify-pack...

Signed-off-by: Philip Oakley <[email protected]>
This reverts commit 30beb16.

The new zlib code copes properly and uses this *internally*

Signed-off-by: Philip Oakley <[email protected]>
The code base now uses size_t for all memsized variables.
Allow shift to reach that bitness level.

Signed-off-by: Philip Oakley <[email protected]>
Also use appropriate format for printing.

Signed-off-by: Philip Oakley <[email protected]>
use size_t for Windows compatibility.

Signed-off-by: Philip Oakley <[email protected]>
For Windows compatibility

Signed-off-by: Philip Oakley <[email protected]>
Doh. I'd fixed everything else. Just this to prove!

Signed-off-by: Philip Oakley <[email protected]>
Otherwise it clashes with the existing index file.
Check that the sha1 matches the existing value;-)

Signed-off-by: Philip Oakley <[email protected]>
@PhilipOakley PhilipOakley requested a review from dscho April 22, 2019 22:54
@tboegi

tboegi commented Apr 23, 2019

That's good news.
Should we use git-for-windows as the base-line ?

@dscho
Member

dscho commented Jun 4, 2019

@PhilipOakley maybe we'll close this PR in favor of #2179?

@PhilipOakley
Author

Yes. Let's consolidate on #2179.
