Full build with PDF is taking more than 24h #169
Building only the changed branches and languages, how does it sound? I don't have much information about the server but here are some ideas:
|
Additional question for @JulienPalard: Are non-bugfix and non-stable releases only rebuilt manually or is there a cron? |
|
What do you think about the idea in general @JulienPalard? |
(see the git_clone function in the script).
Maybe, yes. But to do it cleanly we should only rebuild if the Currently a git log shows that (for the Doc/ directory) changes are:
A change on the CPython side should trigger a rebuild of all languages for the given branch, while a change for a language should not trigger a rebuild for all branches (each translation repo has one translation branch per CPython branch). On the translation side I expect fewer changes, except for a few crons like the one for python-docs-ja, which synchronizes daily. Those changes are also for a single branch, so they invalidate a single CPython branch. So instead of rebuilding 78 docs per day, just looking at the CPython side, we'd rebuild about 26 docs daily (main and 3.11), which should take about 9h. As for how to do this, we could store the CPython commit SHA and the translation commit SHA in a file at build time, and at the start of the next build call a dedicated function to check those SHAs against the repositories to see whether a build is needed. Seems doable. |
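To make the "store the SHAs and compare" idea concrete, here is a minimal sketch. It is not the code in build_docs.py: the JSON state file, the `fr/3.12` style keys, and the function names are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_state(path: Path) -> dict:
    """Return the recorded SHAs, or an empty dict on the first run."""
    if path.exists():
        return json.loads(path.read_text())
    return {}


def needs_build(state: dict, key: str, cpython_sha: str, translation_sha: str) -> bool:
    """A build is needed when either SHA differs from the pair recorded for `key`."""
    return state.get(key) != {"cpython": cpython_sha, "translation": translation_sha}


def record_build(state: dict, path: Path, key: str, cpython_sha: str, translation_sha: str) -> None:
    """Remember which commits the last successful build of `key` used."""
    state[key] = {"cpython": cpython_sha, "translation": translation_sha}
    path.write_text(json.dumps(state, indent=2))
```

A CPython commit changes the first SHA for every language on that branch, while a translation commit only changes the second SHA for one language/branch pair, which matches the invalidation rules described above.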
@JulienPalard Someone opened an issue on Sphinx to ask whether we could speed things up on our side. I am not very confident about that, since most of the build time is actually the PDF build. It is possible to know why Another possibility is to change the way the PDFs are created. We (Sphinx) could technically add a builder which, instead of outputting LaTeX code, outputs some other kind of typesetting language that can be converted to PDF faster than LaTeX. |
Hi @picnixz, thanks for jumping in!
I'm just a user of LaTeX, I don't know much. I know our biggest file is
Those timings can be reproduced on a cpython clone after building latex file:
It reminds me of something about running it in a loop until the output no longer changes, which strace confirms:
This can also be confirmed by reading the console log (I chose a smaller file to get readable output):
Also noticed, from
this is probably less than ideal. I tried to use
it looks like this:
Probably yes,
Looking at the server logs, yes
It would be possible to build PDF using weasyprint. |
It looks like a potential alternative for PDFs building is using Rinohtype. It provides a drop-in replacement for Sphinx PDF builder and is a direct PDF builder. Still in beta though. |
Thank you very much for your report ! Now I'm a bit confused about these timings:
Maybe I misunderstood, but why does the second command take 4 min 13 s when you told me it took 1 min 16 s? I also think the last command should be
Yes, it's because of how LaTeX is structured with this token based approach + expansions. Depending on how expansions are done, it may enormously slow down the build. I am wondering whether breaking down In conclusion, I don't think we could do much except:
By the way, how much is allocated to build the documentation on Python servers? I ask because a straightforward solution is simply to allocate more resources and run everything in parallel (but again, funds are required and I don't know if the server is like a super-mega-huge-powerful server). In the end, if you need to rebuild 26 docs per day (assuming you can reduce from 70+), then you "simply" need a more powerful unit. |
That's because I really don't know if there's a way to forge a latex file that necessitates less re-runs.
Probably just me manually fixing the commands for readability (and breaking them while doing so, haha). |
Ah sorry! yes I overlooked that (I was a bit confused actually because I assumed that the ~1min was the output of Also, since it tells us "or use package Btw, I'm sorry but I cannot really reproduce it myself because I need to install fonts that I don't have (and move them around files + adding paths or so) (but since you've got everything running on your side + timings are for your machine, the comparison is more fair). |
IIRC we use xetex instead of pdflatex for its unicode awareness. Tried:
|
Since #171 has been merged, there has been a build taking 13.6 hours. That's better, probably still room for enhancements. |
I changed the build cron so it starts hourly, so instead of doing nothing for 24h-13.6h the script checks if there's something to build. In the logs here's what I see:
|
The last https://docs.python.org/3/ build was at Dec 07, 2023 (12:24 UTC). The 3.12.1 release was at something like Dec 08, 2023 (00:45 UTC). As there's no public logs yet (#174), I'm curious why there's been no build yet from the hourly cron. Is the queue full, or maybe something else up? This relates to https://discuss.python.org/t/python-3-12-1-now-available/40603/2?u=hugovk: a request for release announcements to include the changelog, but the changelog at https://docs.python.org/3/whatsnew/changelog.html still shows "Next" and not "3.12.1". |
@hugovk @JulienPalard What are the next steps needed for this? |
I see the last build on |
It now says If it took 27.5 hours, what language/versions has it been building to delay it? There have been changes to 3.11-3.13 in the past day, but a week or more for 3.10 and older. |
Since we stopped rebuilding when it's not needed, we're maintaining a file with some infos: see file
it's not that readable but may help a bit. I can paste some logs here:
mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.6.gz
2023-12-07 01:03:31,783 INFO it/3.11: Build start.
2023-12-07 01:34:48,312 INFO id/3.11: Build start.
2023-12-07 02:13:16,175 INFO fr/3.11: Build start.
2023-12-07 02:47:08,767 INFO es/3.11: Build start.
2023-12-07 03:24:20,181 INFO en/3.11: Build start.
2023-12-07 03:53:32,721 INFO zh-tw/3.12: Build start.
2023-12-07 04:54:28,702 INFO zh-cn/3.12: Build start.
2023-12-07 05:51:22,955 INFO uk/3.12: Build start.
2023-12-07 05:55:15,983 INFO tr/3.12: Build start.
2023-12-07 06:29:04,896 INFO pt-br/3.12: Build start.
2023-12-07 07:03:29,075 INFO pl/3.12: Build start.
2023-12-07 07:33:25,766 INFO ko/3.12: Build start.
2023-12-07 08:16:09,363 INFO ja/3.12: Build start.
2023-12-07 09:59:31,894 INFO it/3.12: Build start.
2023-12-07 10:29:39,325 INFO id/3.12: Build start.
2023-12-07 11:10:44,153 INFO fr/3.12: Build start.
2023-12-07 11:45:54,247 INFO es/3.12: Build start.
2023-12-07 12:24:23,271 INFO en/3.12: Build start. <-- the build you've seen in prod while writing your message
2023-12-07 12:55:57,594 INFO zh-tw/3.13: Build start.
2023-12-07 14:58:05,206 INFO zh-cn/3.13: Build start.
2023-12-07 16:59:55,851 INFO uk/3.13: Build start.
2023-12-07 17:06:09,502 INFO tr/3.13: Build start.
2023-12-07 17:53:58,847 INFO pt-br/3.13: Build start.
2023-12-07 18:40:23,044 INFO pl/3.13: Build start.
2023-12-07 19:22:33,254 INFO ko/3.13: Build start.
2023-12-07 20:17:45,832 INFO ja/3.13: Build start.
2023-12-07 22:24:23,323 INFO it/3.13: Build start.
2023-12-07 23:03:59,538 INFO id/3.13: Build start.
2023-12-07 23:50:30,074 INFO fr/3.13: Build start.
mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.5.gz
2023-12-08 00:29:10,617 INFO es/3.13: Build start.
<-- 3.12.1 release
2023-12-08 01:10:12,533 INFO en/3.13: Build start.
2023-12-08 02:07:14,771 INFO zh-tw/3.11: Build start.
2023-12-08 03:02:35,382 INFO zh-cn/3.11: Build start.
2023-12-08 03:55:25,481 INFO uk/3.11: Build start.
2023-12-08 03:58:58,984 INFO tr/3.11: Build start.
2023-12-08 04:30:39,399 INFO pt-br/3.11: Build start.
2023-12-08 05:02:31,596 INFO pl/3.11: Build start.
2023-12-08 05:31:15,058 INFO ko/3.11: Build start.
2023-12-08 06:12:14,383 INFO ja/3.11: Build start.
2023-12-08 07:51:32,897 INFO it/3.11: Build start.
2023-12-08 08:19:59,311 INFO id/3.11: Build start.
<-- your message (more or less 1h)
2023-12-08 08:55:57,314 INFO fr/3.11: Build start.
2023-12-08 09:26:26,468 INFO es/3.11: Build start.
2023-12-08 10:01:36,727 INFO en/3.11: Build start.
2023-12-08 10:29:12,557 INFO zh-tw/3.12: Build start.
2023-12-08 11:28:08,345 INFO zh-cn/3.12: Build start.
2023-12-08 12:28:27,897 INFO uk/3.12: Build start.
2023-12-08 12:33:01,923 INFO tr/3.12: Build start.
2023-12-08 13:09:30,960 INFO pt-br/3.12: Build start.
2023-12-08 13:45:39,082 INFO pl/3.12: Build start.
2023-12-08 14:17:58,792 INFO ko/3.12: Build start.
2023-12-08 15:02:13,565 INFO ja/3.12: Build start.
2023-12-08 16:49:28,995 INFO it/3.12: Build start.
2023-12-08 17:24:15,640 INFO id/3.12: Build start.
2023-12-08 18:03:34,328 INFO fr/3.12: Build start.
2023-12-08 18:35:41,454 INFO es/3.12: Build start.
2023-12-08 19:11:52,848 INFO en/3.12: Build start. <--
2023-12-08 19:41:12,356 INFO zh-tw/3.13: Build start.
2023-12-08 21:28:18,628 INFO zh-cn/3.13: Build start.
2023-12-08 23:08:18,807 INFO uk/3.13: Build start.
2023-12-08 23:12:37,137 INFO tr/3.13: Build start.
2023-12-08 23:48:46,069 INFO pt-br/3.13: Build start.
mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.4.gz
2023-12-09 00:25:10,138 INFO pl/3.13: Build start.
2023-12-09 00:57:45,725 INFO ko/3.13: Build start.
2023-12-09 01:40:28,952 INFO ja/3.13: Build start.
2023-12-09 03:21:11,244 INFO it/3.13: Build start.
2023-12-09 03:53:36,108 INFO id/3.13: Build start.
2023-12-09 04:35:39,054 INFO fr/3.13: Build start.
2023-12-09 05:08:42,237 INFO es/3.13: Build start.
2023-12-09 05:44:38,758 INFO en/3.13: Build start.
2023-12-09 07:07:18,041 INFO zh-tw/3.11: Build start.
2023-12-09 07:58:36,484 INFO zh-cn/3.11: Build start.
2023-12-09 08:50:50,170 INFO uk/3.11: Build start.
2023-12-09 08:54:18,539 INFO tr/3.11: Build start.
2023-12-09 09:25:13,308 INFO pt-br/3.11: Build start.
2023-12-09 09:55:38,752 INFO pl/3.11: Build start.
2023-12-09 10:24:47,533 INFO ko/3.11: Build start.
2023-12-09 11:05:06,125 INFO ja/3.11: Build start.
2023-12-09 12:47:05,609 INFO it/3.11: Build start.
2023-12-09 13:17:31,303 INFO id/3.11: Build start.
2023-12-09 13:55:56,164 INFO fr/3.11: Build start.
2023-12-09 14:30:28,217 INFO es/3.11: Build start.
2023-12-09 15:10:10,126 INFO en/3.11: Build start.
2023-12-09 15:40:26,584 INFO zh-tw/3.12: Build start.
2023-12-09 16:42:34,264 INFO zh-cn/3.12: Build start.
2023-12-09 17:43:02,356 INFO uk/3.12: Build start.
2023-12-09 17:47:08,059 INFO tr/3.12: Build start.
2023-12-09 18:21:25,707 INFO pt-br/3.12: Build start.
2023-12-09 18:55:27,545 INFO pl/3.12: Build start.
2023-12-09 19:26:25,268 INFO ko/3.12: Build start.
2023-12-09 20:08:18,771 INFO ja/3.12: Build start.
2023-12-09 21:51:05,595 INFO it/3.12: Build start.
2023-12-09 22:23:04,710 INFO id/3.12: Build start.
2023-12-09 23:03:02,551 INFO fr/3.12: Build start.
2023-12-09 23:36:46,483 INFO es/3.12: Build start.
mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.3.gz
2023-12-10 00:12:31,758 INFO en/3.12: Build start. <--
2023-12-10 00:38:36,561 INFO zh-tw/3.13: Build start.
2023-12-10 02:11:45,282 INFO zh-cn/3.13: Build start.
2023-12-10 03:40:09,197 INFO uk/3.13: Build start.
2023-12-10 03:43:59,909 INFO tr/3.13: Build start.
2023-12-10 04:16:53,655 INFO pt-br/3.13: Build start.
2023-12-10 04:50:19,080 INFO pl/3.13: Build start.
2023-12-10 05:21:18,370 INFO ko/3.13: Build start.
2023-12-10 05:59:52,249 INFO ja/3.13: Build start.
2023-12-10 07:39:20,008 INFO it/3.13: Build start.
2023-12-10 08:12:51,583 INFO id/3.13: Build start.
2023-12-10 08:56:53,066 INFO fr/3.13: Build start.
2023-12-10 09:32:21,252 INFO es/3.13: Build start.
2023-12-10 10:10:41,943 INFO en/3.13: Build start.
2023-12-10 11:07:17,607 INFO zh-tw/3.11: Build start.
2023-12-10 12:01:43,511 INFO zh-cn/3.11: Build start.
2023-12-10 12:55:46,836 INFO uk/3.11: Build start.
2023-12-10 12:59:14,477 INFO tr/3.11: Build start.
2023-12-10 13:31:47,797 INFO pt-br/3.11: Build start.
2023-12-10 14:04:52,957 INFO pl/3.11: Build start.
2023-12-10 14:35:04,605 INFO ko/3.11: Build start.
2023-12-10 15:18:22,877 INFO ja/3.11: Build start.
2023-12-10 17:06:49,408 INFO it/3.11: Build start.
2023-12-10 17:40:07,017 INFO id/3.11: Build start.
2023-12-10 18:21:59,182 INFO fr/3.11: Build start.
2023-12-10 18:57:50,425 INFO es/3.11: Build start.
2023-12-10 19:38:27,999 INFO en/3.11: Build start.
2023-12-10 20:09:49,259 INFO zh-tw/3.12: Build start.
2023-12-10 21:13:16,417 INFO zh-cn/3.12: Build start.
2023-12-10 22:14:28,313 INFO uk/3.12: Build start.
2023-12-10 22:18:52,271 INFO tr/3.12: Build start.
2023-12-10 22:52:49,215 INFO pt-br/3.12: Build start.
2023-12-10 23:26:13,913 INFO pl/3.12: Build start.
2023-12-10 23:55:45,872 INFO ko/3.12: Build start.
mdk@docs:~$ zgrep 'Build start' /var/log/docsbuild/docsbuild.log.2.gz
2023-12-11 00:38:06,339 INFO ja/3.12: Build start.
2023-12-11 02:19:10,223 INFO it/3.12: Build start.
2023-12-11 02:48:39,326 INFO id/3.12: Build start.
2023-12-11 03:25:56,754 INFO fr/3.12: Build start.
2023-12-11 03:57:07,365 INFO es/3.12: Build start.
2023-12-11 04:33:46,003 INFO en/3.12: Build start. <--
2023-12-11 05:02:28,158 INFO zh-tw/3.13: Build start.
2023-12-11 06:40:57,160 INFO zh-cn/3.13: Build start.
2023-12-11 08:20:40,118 INFO uk/3.13: Build start.
2023-12-11 08:25:12,725 INFO tr/3.13: Build start.
2023-12-11 09:00:10,942 INFO pt-br/3.13: Build start.
2023-12-11 09:31:30,328 INFO pl/3.13: Build start.
2023-12-11 10:00:34,948 INFO ko/3.13: Build start.
2023-12-11 10:38:35,804 INFO ja/3.13: Build start.
2023-12-11 12:15:57,521 INFO it/3.13: Build start.
2023-12-11 12:46:40,636 INFO id/3.13: Build start.
2023-12-11 13:26:24,106 INFO fr/3.13: Build start.
2023-12-11 13:59:58,381 INFO es/3.13: Build start.
2023-12-11 14:39:16,552 INFO en/3.13: Build start.
2023-12-11 16:07:20,776 INFO zh-tw/3.11: Build start.
2023-12-11 17:04:35,431 INFO zh-cn/3.11: Build start.
2023-12-11 18:01:09,964 INFO uk/3.11: Build start.
2023-12-11 18:05:16,722 INFO tr/3.11: Build start.
2023-12-11 18:38:16,662 INFO pt-br/3.11: Build start.
2023-12-11 19:11:47,677 INFO pl/3.11: Build start.
2023-12-11 19:43:07,048 INFO ko/3.11: Build start.
2023-12-11 20:27:02,135 INFO ja/3.11: Build start.
2023-12-11 22:05:54,678 INFO it/3.11: Build start.
2023-12-11 22:33:47,162 INFO id/3.11: Build start.
2023-12-11 23:09:11,730 INFO fr/3.11: Build start.
2023-12-11 23:42:10,783 INFO es/3.11: Build start.
So yes, there's still almost 24h between any builds. In 14 days we've seen only 45 builds being skipped vs 449 rebuilds being needed:
This is partly due to some translation repos having a cron committing every day (like Transifex pulls) even if there are no new translations in the files. But the root cause of the slow build time is probably more #169 (comment) (latexmk running xelatex in a loop until the result stops changing). @ewdurbin what "CPU" do we have on the machine? I don't see useful info on |
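For anyone who wants to measure the gap between successive builds of one language/version from these logs, here is a rough sketch. The four sample lines are the en/3.12 entries from the listing above with the "<--" annotations dropped; the function name is made up for this example.

```python
import datetime as dt
from collections import defaultdict

LOG = """\
2023-12-07 12:24:23,271 INFO en/3.12: Build start.
2023-12-08 19:11:52,848 INFO en/3.12: Build start.
2023-12-10 00:12:31,758 INFO en/3.12: Build start.
2023-12-11 04:33:46,003 INFO en/3.12: Build start.
"""


def build_gaps(lines):
    """Map each language/version to the hours between its successive build starts."""
    starts = defaultdict(list)
    for line in lines:
        if "Build start." not in line:
            continue
        # Fields are: date, time, log level, "language/version:"
        date, time, _level, name = line.strip().split(" ")[:4]
        stamp = dt.datetime.strptime(f"{date} {time}", "%Y-%m-%d %H:%M:%S,%f")
        starts[name.removesuffix(":")].append(stamp)
    return {
        name: [round((b - a).total_seconds() / 3600, 1) for a, b in zip(stamps, stamps[1:])]
        for name, stamps in starts.items()
    }


print(build_gaps(LOG.splitlines()))
```

For the en/3.12 entries this gives gaps of roughly 28 to 31 hours, so a fix merged on the branch can wait more than a day before it shows up.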
How important are the latex builds? We may want to look at additional ways of optimizing the latex builds. https://blog.martisak.se/2023/10/01/compiling/ |
PDF builds are based on LaTeX builds. When PDFs are unavailable we have user complaints (typically on docs@). |
Yeah, we definitely need PDFs. I wonder if it makes sense to try building PDF with an alternate tool, |
Shotscraper could be another approach: https://shot-scraper.datasette.io/en/stable/pdf.html |
PDFs should provide a table of contents and an index. But I don't know of a good tool to convert ePub to PDF. |
mdk@docs:~$ apt-cache policy latexmk
latexmk:
Installed: 1:4.67-0.1
Candidate: 1:4.67-0.1
Version table:
*** 1:4.67-0.1 500
500 http://us.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
100 /var/lib/dpkg/status
mdk@docs:~$ apt-cache policy texlive-xetex
texlive-xetex:
Installed: 2019.20200218-1
Candidate: 2019.20200218-1
Version table:
*** 2019.20200218-1 500
500 http://us.archive.ubuntu.com/ubuntu focal/universe amd64 Packages
100 /var/lib/dpkg/status
|
The new Ubuntu 24.04 server (python/psf-salt#421) has:
hugovk@docs:~$ apt-cache policy latexmk
latexmk:
Installed: 1:4.83-1
Candidate: 1:4.83-1
Version table:
*** 1:4.83-1 500
500 http://mirrors.digitalocean.com/ubuntu noble/universe amd64 Packages
100 /var/lib/dpkg/status
hugovk@docs:~$ latexmk --version
Latexmk, John Collins, 31 Jan. 2024. Version 4.83
hugovk@docs:~$ apt-cache policy texlive-xetex
texlive-xetex:
Installed: 2023.20240207-1
Candidate: 2023.20240207-1
Version table:
*** 2023.20240207-1 500
500 http://mirrors.digitalocean.com/ubuntu noble/universe amd64 Packages
100 /var/lib/dpkg/status
hugovk@docs:~$ xetex --version
XeTeX 3.141592653-2.6-0.999995 (TeX Live 2023/Debian)
kpathsea version 6.3.5
Copyright 2023 SIL International, Jonathan Kew and Khaled Hosny.
There is NO warranty. Redistribution of this software is
covered by the terms of both the XeTeX copyright and
the Lesser GNU General Public License.
For more information about these matters, see the file
named COPYING and the XeTeX source.
Primary author of XeTeX: Jonathan Kew.
Compiled with ICU version 74.2; using 74.2
Compiled with zlib version 1.3; using 1.3
Compiled with FreeType2 version 2.13.2; using 2.13.2
Compiled with Graphite2 version 1.3.14; using 1.3.14
Compiled with HarfBuzz version 8.3.0; using 8.3.0
Compiled with libpng version 1.6.43; using 1.6.43
Compiled with pplib version v2.05 less toxic i hope
Compiled with fontconfig version 2.15.0; using 2.15.0
And it has 4 CPUs with 8 GB RAM:
hugovk@docs:~$ cat /proc/cpuinfo | grep processor
processor : 0
processor : 1
processor : 2
processor : 3
hugovk@docs:~$ free --human
total used free shared buff/cache available
Mem: 7.8Gi 1.6Gi 4.1Gi 4.3Mi 2.4Gi 6.2Gi
Swap: 1.0Gi 0B 1.0Gi |
Here's some observations and a suggestion.
Observations
Whilst checking the fix for #186, for the past couple of days I've been keeping an eye on the server build log and top processes ( Most of the time is spent on the logs
When this runs Sphinx, we sometimes get four
But most of the time it's running a latex command. These only use CPU, but at least generally use 100%:
Here's the time between the "Build start"/"Build done" pairs: table of build times
We're spending a LOT of time churning out those PDFs: an hour or two per language/version.
Stepping back a moment, the cron entry is:
7 * * * * /srv/docsbuild/venv/bin/python /srv/docsbuild/scripts/build_docs.py
This script builds all languages and versions, and currently takes about 30 hours for a full batch. The cron triggers at 7 minutes past the hour. If there's a build currently running, the new one just ends.
Suggestion: split HTML and PDF builds
I suggest we have two cron jobs. One that builds only HTML, the other that does everything else except HTML.
In fact, I just spotted that #127 / python/psf-salt#236 said we used to build PDFs daily and the docs hourly. That PR was merged two years ago, shall we go back to this? |
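If the HTML and non-HTML jobs are split, each one needs its own "is a builder already running?" check so that an hourly HTML run is not blocked by a long PDF run. A minimal sketch of that pattern with a non-blocking file lock follows. This is an assumption about how it could be done on Linux, not a description of what build_docs.py does today, and the lock paths in the usage note are hypothetical.

```python
import fcntl


def try_lock(path):
    """Return an open, locked file object, or None if another run holds the lock."""
    handle = open(path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the lock.
        fcntl.flock(handle, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_once(path, job):
    """Run job() unless another builder holds the lock; report whether it ran."""
    lock = try_lock(path)
    if lock is None:
        return False  # the equivalent of "Another builder is running... dying..."
    try:
        job()
        return True
    finally:
        lock.close()  # closing the file releases the lock
```

With one lock file per kind of build (say `html.lock` and `non-html.lock`), the two cron jobs can overlap each other while each still refuses to start a second copy of itself.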
I would even suggest building PDFs every 36/48 hours. The main problem with building PDFs less often seems to have been new releases (especially .0) don't have PDF versions at release date, but this seems surmountable. It seems that we could run multiple A |
Sounds promising. I can not really test, but wouldn't I tried running directly latexmk as in https://tex.stackexchange.com/a/699204 but on my system it does not exit once job completes one has to hit RET; besides one should not run it directly so I tried adding the But here it should be I guess the with the largest (I already admitted being on a single core system, so I can not vouch this will use multiple cores in real life). As per my earlier comment about |
It turns out we have inadvertently been running the full The LaTeX build in Sphinx generates a makefile which we use to run latexmk and generate the actual PDFs from the Considering the generated Makefile at Resolving this is straightforward, and I have a branch prepared to open a PR to CPython. EDIT: I have opened the PR at python/cpython#123113. Sample timings (8 core CPU, via WSL):
I am proposing to use the final option, as whilst not the absolute fastest, it helps to mitigate scheduling problems, and the docs server only has 4 CPU cores, per Hugo's comment. It seems that A |
Good catch! The blame is on me at sphinx-doc/sphinx@409605d, but in my defence the former Makefile did 5 (pdf)latex builds each time. Actually, re-reading your post, I am wondering if
|
I think one has to make sure none of the 4 or 5 biggest files end up being handled by same core which may require a more complex layer to first list files as per decreasing filesizes of the
|
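One way to keep the biggest files off the same core is a greedy "largest first" assignment: sort by decreasing size and always give the next file to the least-loaded core. The sketch below only illustrates that idea. The file names and sizes are invented, and file size is just a stand-in for LaTeX run time.

```python
import heapq


def assign_to_cores(sizes, cores):
    """Greedy longest-first assignment: the next biggest file goes to the least-loaded core."""
    # Each heap entry is (current load, core index, files assigned so far).
    heap = [(0, core, []) for core in range(cores)]
    heapq.heapify(heap)
    for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        load, core, names = heapq.heappop(heap)
        names.append(name)
        heapq.heappush(heap, (load + size, core, names))
    return [names for _load, _core, names in sorted(heap, key=lambda entry: entry[1])]


# Hypothetical .tex sizes in KiB, for illustration only.
example = {"a.tex": 900, "b.tex": 300, "c.tex": 250, "d.tex": 120, "e.tex": 100, "f.tex": 60}
print(assign_to_cores(example, cores=4))
```

With these numbers the three largest files each get a core to themselves and the three small ones share the fourth, so the total time is bounded by the largest file.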
With @AA-Turner's latexmk improvements (#169 (comment)) we're saving about a third of the time! Here's a difference in the build times (not publish and so on) for a full set of 3.14 builds:
PS Here's the separate before and after times, and the script I used to get the times from the server build times. before
after
calc-times.py
"""
Example log:
2024-08-14 09:07:13,383 INFO zh-tw/3.12: Build start.
2024-08-14 09:07:13,384 INFO zh-tw/3.12: Running make autobuild-stable
2024-08-14 09:07:13,384 DEBUG zh-tw/3.12: Run: "sed -i 's/ *-A switchers=1//' /srv/docsbuild/cpython/Doc/Makefile"
2024-08-14 09:07:13,392 DEBUG zh-tw/3.12: Run: "make -C /srv/docsbuild/cpython/Doc PYTHON=/srv/docsbuild/venv-3.12/bin/python SPHINXBUILD=/srv/docsbuild/venv-3.12/bin/sphinx-build BLURB=/srv/docsbuild/venv-3.12/bin/blurb VENVDIR=/srv/docsbuild/venv-3.12 'SPHINXOPTS=-D latex_engine=xelatex -D latex_elements.inputenc= -D latex_elements.fontenc=\\\\usepackage{xeCJK} -q -D locale_dirs=/srv/docsbuild/3.12/locale -D language=zh_TW -D gettext_compact=0' SPHINXERRORHANDLING= autobuild-stable"
2024-08-14 10:07:01,951 INFO: Another builder is running... dying...
2024-08-14 10:46:22,454 DEBUG zh-tw/3.12: Run: 'mkdir -p /var/log/docsbuild'
2024-08-14 10:46:22,462 DEBUG zh-tw/3.12: Run: 'chgrp -R docs /var/log/docsbuild'
2024-08-14 10:46:23,852 INFO zh-tw/3.12: Build done.
"""
import argparse
import datetime as dt
import glob
from functools import cache

from prettytable import MARKDOWN, PrettyTable


@cache
def format_seconds(seconds: float) -> str:
    """Format a duration as "Xh Ym", or "Ym" when under an hour."""
    hours, minutes = divmod(seconds, 3600)
    minutes, _ = divmod(minutes, 60)
    hours, minutes = int(hours), int(minutes)

    match (hours, minutes):
        case 0, m:
            return f"{m}m"
        case h, m:
            return f"{h}h {m}m"


def get_lines(logfiles: list[str]) -> list[str]:
    """Read all lines from the uncompressed log files."""
    lines = []
    for logfile in logfiles:
        if logfile.endswith(".gz"):
            continue
        with open(logfile) as f:
            lines.extend(f.readlines())
    return lines


def calc_time(lines: list[str]) -> None:
    """Print a Markdown table with the minutes between "Build start"/"Build done" pairs."""
    start = end = language_version = start_timestamp = None
    table = PrettyTable()
    table.set_style(MARKDOWN)
    table.field_names = ["Start", "Language/version", "Build"]
    table.align["Build"] = "r"

    for line in lines:
        line = line.strip()

        if line.endswith("Build start."):
            timestamp = line[:23].replace(",", ".")
            language_version = line.split(" ")[3].removesuffix(":")
            start = dt.datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")
            start_timestamp = line[:16]

        if start and line.endswith("Build done."):
            timestamp = line[:23].replace(",", ".")
            language_version = line.split(" ")[3].removesuffix(":")
            end = dt.datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S.%f")

        if start and end:
            table.add_row(
                [
                    start_timestamp,
                    language_version,
                    # format_seconds((end - start).total_seconds()),
                    (end - start).total_seconds() / 60,
                ]
            )
            start = end = None

    print(table)


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument(
        "logfiles", help="log file to read", nargs="?", default="docsbuild.log*"
    )
    args = parser.parse_args()

    logfiles = glob.glob(args.logfiles)
    # Each line starts with a timestamp, so sorting puts them in chronological order.
    lines = sorted(get_lines(logfiles))
    calc_time(lines)


if __name__ == "__main__":
    main()
|
If you can build in an environment with a LaTeX distribution based upon TeXLive 2019 you will probably observe 30% or more additional gain. See this comment. I can not guarantee it completely because the comment compared builds on TL2019 and TL2024 and you use a Debian TL2023, so the gain might not be as much (I currently can not test on a TL2023). The PDFs will be exactly identical. edit: I mean will look and print identical. Perhaps things are different regarding copy-paste from PDF and some other matters. (I can not quantify at this time if the change from TL2023 to TL2024 induced the major part in slow-down of LaTeX kernel, or if it occurred earlier). |
Another 30% or more would be very welcome! But I'm not too sure if we should downgrade to an unsupported version? (Or how to do it.) |
A docker image is available for all TeXLive versions since 2014. The problem is that it contains the full thing which is much much more than what is strictly needed for Sphinx There is a sphinx-latexpdf Docker image used for our CI, if I knew the recipe I could modify it to use TeXLive2019. And in the process try to use a much smaller part of full TeXLive hence construct a much smaller Docker image. But this is time-consuming task for which I don't have the immediate qualifications. |
Update, we've cut the build times by around another third, by cutting about 15h per full build loop through dropping the letter PDF build (and keeping A4 PDF): And by about an extra half an hour by re-using an HTTPS connection for purging CDN URLs: |
It seems we're currently running around 10h 40m for each version -- Python 3.13 hasn't done a full rebuild in a while as the last commit was a fortnight ago (save around a dozen in the last half-hour!). This would be roughly 32 hours for a full rebuild of 3.12--3.14. Timings:
(reason is 'translation' if an updated translation caused the rebuild rather than updated docs, and 'FULL BUILD' captures the end of a full rebuild loop. The loops that are 'empty' but around 30m long attempt a build of A |
This initially caught me by surprise but is expected; the 3.13 branch is locked for RC2, so there's not been much merged on the branch recently: |
This is the recipe: https://github.com/csotomon/sphinx-docker/blob/master/latexpdf/Dockerfile |
That link is to a fork, though it's unclear -- the up-to-date dockerfile is at https://github.com/sphinx-doc/sphinx-docker-images/blob/master/latexpdf/Dockerfile |
I suggest that we close this issue in favour of #209, now that python/docs-community#131 has been implemented. The non-HTML archive builds still take a while, but it's a manageable duration and we now have the luxury of adjusting the frequency of those builds to e.g. once every two days rather than daily. Thank you to everyone involved for the help in reducing build times, we have made significant strides here in a challenging problem straddling multiple teams, projects, and repos. A |
Here's a summary of the current times, much improved: Yes, let's close and continue in other issues as needed. Thanks all! |
Today we're building 6 versions for 13 languages (that's 78 builds).
I'm doing some tests on my machine to get an idea:
make html
composed of: make text
make latex PAPER=a4
make all-pdf
(in build/latex/) (less if run again, like 1 s if not removing PDFs, or 1 min after removing PDFs). Can be cut down to 6 min with -j 4.
So a complete rebuild should take ~27h (on my machine, the server may have a different CPU).
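As a sanity check on these estimates, the arithmetic can be written out. The sketch assumes every build costs about the same, which the logs in the comments show is only roughly true (some languages take much longer than others), and the two-branch case is the "main and 3.11" scenario mentioned in the discussion.

```python
LANGUAGES = 13
VERSIONS = 6
FULL_REBUILD_HOURS = 27  # the ~27h estimate from the timings above

builds = LANGUAGES * VERSIONS
minutes_per_build = FULL_REBUILD_HOURS * 60 / builds


def hours_for(active_versions: int) -> float:
    """Estimated hours if only `active_versions` branches need rebuilding."""
    return LANGUAGES * active_versions * minutes_per_build / 60


# Total builds, average minutes per build, hours for two active branches.
print(builds, round(minutes_per_build, 1), round(hours_for(2), 1))
```

That works out to 78 builds at about 21 minutes each, and about 9 hours when only two branches change.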