<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="/assets/feed.xslt"?>
<rss version="2.0"
     xmlns:atom="http://www.w3.org/2005/Atom"
     xmlns:content="http://purl.org/rss/1.0/modules/content/"
     xmlns:wfw="http://wellformedweb.org/CommentAPI/">
<channel>
<title>Web logs of McSinyx</title>
<link>https://lumvok.store</link>
<atom:link href="https://lumvok.store/feed.xml" rel="self" type="application/rss+xml"/>
<description>Random write-ups packed with pop culture references</description>
<copyright><![CDATA[🄯 2019–2024 Nguyễn Gia Phong under CC BY-SA 4.0]]></copyright>
<language>en</language>
<generator>Franklin</generator>
<item>
  <title>De-Dependency December</title>
  <link>https://lumvok.store/blog/dedep/index.html</link>
  <guid>https://lumvok.store/blog/dedep/index.html</guid>
  <description>Call for Participation: De-Dependency December</description>
  <category>fun</category><category>pkg</category>
  <pubDate>Thu, 10 Nov 2022 00:00:00 +0000</pubDate>
  <content:encoded><![CDATA[
<h1 id="de-dependency_december">De-Dependency December</h1>
<blockquote>
<p>As we mature, the dependency graph matures with us.</p>
</blockquote>
<h2 id="exposition">Exposition</h2>
<p>In the <a href="https://www.youtube.com/watch?v&#61;stChOsejLEQ">occasional fights</a> between system and language packagers, <a href="https://man.sr.ht/~cnx/ipwhl">I&#39;m known to take the downstream camp.</a>  As a user, there are lots of things I take for granted.  I install the stuff I need, occasionally upgrade the system, and everything gets updated. Vulnerability in a library used by multiple programs?  Its patched version gets swapped in within a few hours &#40;given it&#39;s not <a href="https://blogs.gentoo.org/mgorny/2021/02/19/the-modern-packagers-security-nightmare">vendored or pinned</a>&#41;. <a href="https://wiki.debian.org/Hardening">Most</a> <a href="https://fedoraproject.org/wiki/Changes/Harden_All_Packages">distributions</a> <a href="https://wiki.archlinux.org/title/Arch_package_guidelines/Security">even</a> <a href="https://en.opensuse.org/openSUSE:Security_Features">apply</a> <a href="https://wiki.gentoo.org/wiki/Hardened/Toolchain#Changes">hardening</a> <a href="https://nixos.org/manual/nixpkgs/stable#sec-hardening-in-nixpkgs">flags</a> so that <a href="https://xeiaso.net/blog/openssl-alarm-fatigue">some bugs aren&#39;t even exploitable in the first place</a>.  They create a <a href="https://www.youtube.com/watch?v&#61;205ODJgAEik">safe place</a> for me to comfortably express myself at work and at home.</p>
<p>Recently on my work computer, I&#39;ve switched to Guix System, which doesn&#39;t have that many packages yet.  Looking into how to package the programs I use and into ongoing efforts, I realized the colossal number of transitive dependencies of <a href="https://issues.guix.gnu.org/55903">certain software</a> and how impractical it is for a user union &#40;i.e. a distro&#41; to maintain such a set of <a href="https://raku-advent.blog/2021/12/06/unix_philosophy_without_leftpad">micro packages</a> in every language.</p>
<h2 id="confrontation">Confrontation</h2>
<p>This made me think more seriously about software sustainability.  Such a topic often reminds us of energy consumption, modularity, development model, or even style &#40;clean code&#41;.  End users &#40;including self-hosters&#41;, on the other hand, ask the following questions when deciding whether to install and keep a piece of software:</p>
<ul>
<li><p>Can I <em>trust</em> that installing this won&#39;t do anything funny to my machine?</p>
</li>
<li><p>How much <a href="https://xkcd.com/303">effort</a> do I need to prevent people from doing funny things to my machine if the software includes <a href="https://heartbleed.com">something that gets on the front page of some magazines</a> tomorrow?</p>
</li>
<li><p>How much of my limited resources will it take to run or <a href="https://ludocode.com/blog/flatpak-is-not-the-future">simply exist</a>?</p>
</li>
</ul>
<p>The concerns of enterprises and users overlap to some extent; however, it&#39;s worth noting that distributions are almost exclusively optimized to cater to the users&#39; needs.  Not only do they <a href="https://en.wikipedia.org/wiki/Tron">fight for the users</a>, they <em>are</em> the users.  If you don&#39;t want to write yellow-glowing programs, you should <a href="https://drewdevault.com/2021/09/27/Let-distros-do-their-job.html">make the life of downstream package maintainers easier</a>. No, it does not count if you push them to give in and run <a href="https://github.com/NixOS/nixpkgs/blob/master/pkgs/build-support/go/module.nix"><code>go mod vendor</code></a> or <a href="https://github.com/svanderburg/node2nix">download from NPM recursively</a>.</p>
<h2 id="resolution">Resolution</h2>
<p><em>So how do I write software that is easy to package</em>, you may ask. If you followed the articles linked above, you&#39;ve probably already figured that out.  It&#39;s less about what you <em>write</em> and more about what you <em>use</em>.  When someone complains a program is difficult to build from source, certainly it&#39;s not about how hard it is to type, say, <code>make install</code>, but about acquiring the dependencies needed for that to run successfully and for the result to work.</p>
<p>Lowering the number of dependencies will absolutely help.  To put it bluntly, you can&#39;t have a problem with dependencies if there are none of them. This sounds like reinventing the wheel, but if the use case is common enough, you might find what you need in the standard library.<sup id="fnref:rust">[1]</sup> I&#39;ve been restricting myself from using third-party libraries for new side projects and it actually worked for my most recent ones:</p>
<ul>
<li><p><a href="https://trong.loang.net/phylactery">Phylactery</a>, a static comics web server in Go with <a href="https://en.wikipedia.org/wiki/Comic_book_archive">CBZ</a> parsing and concurrent request handling</p>
</li>
<li><p><a href="https://trong.loang.net/~cnx/fead">Fead</a>, an <a href="https://en.wikipedia.org/wiki/Static_site_generator">SSG</a> plugin in Python for advertising others’ feeds with parallel HTTP requests, parsing of RSS 2 and Atom, and CLI argument parsing</p>
</li>
</ul>
<p>Even for such simple use cases, there are still many libraries in the wild that can handle more data formats, are more convenient to use or more performant.  On the other hand, the amount of maintenance needed to keep the programs safe indefinitely for a user is much lower thanks to the small dependency footprint.</p>
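<p>As a rough illustration of how far the standard library alone can go, here is a minimal sketch combining RSS 2 parsing, a thread pool and CLI argument parsing.  It is not Fead&#39;s actual code: the names <code>sample_feed</code>, <code>parse_rss2</code> and <code>main</code> are made up for this example, and the feeds are generated in memory where a real tool would fetch them over HTTP.</p>
<pre><code class="language-python">import argparse
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def sample_feed(name):
    """Build a tiny RSS 2 document in memory, standing in for an HTTP response."""
    rss = ET.Element('rss', version='2.0')
    channel = ET.SubElement(rss, 'channel')
    ET.SubElement(channel, 'title').text = name
    item = ET.SubElement(channel, 'item')
    ET.SubElement(item, 'title').text = 'Hello from ' + name
    ET.SubElement(item, 'link').text = 'https://example.com/' + name
    return ET.tostring(rss, encoding='unicode')


def parse_rss2(text):
    """Return (feed title, item title, item link) triples from RSS 2 text."""
    channel = ET.fromstring(text).find('channel')
    feed = channel.findtext('title')
    return [(feed, item.findtext('title'), item.findtext('link'))
            for item in channel.iter('item')]


def main(argv=None):
    """Parse CLI arguments, then build and parse the feeds using a thread pool."""
    parser = argparse.ArgumentParser(description='list items of sample feeds')
    parser.add_argument('names', nargs='+', help='hypothetical feed names')
    parser.add_argument('-j', '--jobs', type=int, default=4)
    args = parser.parse_args(argv)
    with ThreadPoolExecutor(max_workers=args.jobs) as pool:
        # A real tool would fetch each feed over the network at this point.
        texts = pool.map(sample_feed, args.names)
        return [entry for text in texts for entry in parse_rss2(text)]
</code></pre>
<p>Calling <code>main&#40;[&#39;foo&#39;, &#39;bar&#39;]&#41;</code> returns one triple per item, in the order the names were given, without a single third-party import.</p>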
<p>What I&#39;m asking you to give a try in the advent days<sup id="fnref:advent">[2]</sup> is not as drastic. Look through your work and find a library you require for only a small portion of its <a href="https://www.youtube.com/watch?v&#61;3Mpyias9ek4">power</a>, or one that can be reimplemented specifically for your project with reasonable effort &#40;w.r.t. the whole codebase&#41;, then replace it.  This is not just for the sake of maintainability: <a href="https://guide.handmade-seattle.com/c/2021/context-is-everything">being less general, the new implementation can likely outperform the replaced public library</a>.</p>
<p><img src="https://lumvok.store/assets/outlets.jpg" alt="Multiple types of sockets installed on the same wall" /></p>
<p>In many cases, you will find yourself making use of the standard library. Standards make life much easier, <a href="https://xkcd.com/927">if only</a> people could come to an agreement.  Or maybe they don&#39;t have to.  Maybe each could choose among the <a href="https://raku-advent.blog/2021/12/11/unix_philosophy_without_leftpad_part2">utility libraries</a>.  At the end of the day, what matters is the total number of packages that can have bugs to be reported upstream and patched.</p>
<p>That being said, please keep an eye on the standard library the same way you &#40;should&#41; watch your other dependencies, just in case what you need is finally added.  Worry not about backward incompatibility, <a href="https://wiki.debian.org/DontBreakDebian#Don.27t_suffer_from_Shiny_New_Stuff_Syndrome">users of LTS systems are content with older versions</a> of your software.</p>
<h2 id="fall_and_catastrophe">Fall and Catastrophe</h2>
<p>Just kidding, I&#39;m offering <a href="https://en.wikipedia.org/wiki/Three-act_structure">answers</a>, not <a href="https://en.wikipedia.org/wiki/Dramatic_structure#Freytag&#39;s_pyramid">tragedies</a>.  Winter is coming, join me in a De-Dependency December and fight for the users&#33;</p>
<table class="fndef" id="fndef:rust">
    <tr>
        <td class="fndef-backref">[1]</td>
        <td class="fndef-content">Unless you use Rust.</td>
    </tr>
</table><table class="fndef" id="fndef:advent">
    <tr>
        <td class="fndef-backref">[2]</td>
        <td class="fndef-content">I&#39;m not Christian, but I had fun with <a href="https://adventofcode.com">AoC</a> and <a href="https://breezewiki.com/neopets/wiki/Advent_Calendar">Neopets</a> before.</td>
    </tr>
</table>    <a href="mailto:cnx.site@loa.loang.net?In-Reply-To=%3Cblog/dedep@cnx%3E&Subject=Re: De-Dependency December">Reply via email</a>]]></content:encoded>
  <comments><![CDATA[https://lists.sr.ht/~cnx/site?search=In-Reply-To:%3Cblog/dedep@cnx%3E]]></comments>
  <wfw:commentRss>https://lumvok.store/blog/dedep/comments.xml</wfw:commentRss>
</item>
<item>
  <title>Google Summer of Code 2020</title>
  <link>https://lumvok.store/blog/2020/gsoc/index.html</link>
  <guid>https://lumvok.store/blog/2020/gsoc/index.html</guid>
  <description>GSoC 2020 final report</description>
  <category>fun</category><category>exp</category><category>gsoc</category><category>pkg</category><category>pip</category>
  <pubDate>Mon, 31 Aug 2020 00:00:00 +0000</pubDate>
  <content:encoded><![CDATA[
<h1 id="google_summer_of_code_2020">Google Summer of Code 2020</h1>
<p>In the summer of 2020, I worked with the contributors of <code>pip</code>, trying to improve the networking performance of the package manager. Admittedly, at the end of <a href="https://summerofcode.withgoogle.com/archive/2020/projects/6238594655584256">the internship</a> period, <a href="https://lumvok.store/blog/2020/gsoc/article/7/#the_benchmark">the benchmark said otherwise</a>; though I really hope the clean-up and minor fixes I happened to be doing to the codebase over the summer, in addition to the implementation of parallel utils and lazy wheel, might actually help the project.</p>
<p>Personally, I learned a lot: not just about Python packaging and networking stuff, but also about how to work with others.  I am really grateful to <a href=https://github.com/pradyunsg>@pradyunsg</a> &#40;my mentor&#41;, <a href=https://github.com/chrahunt>@chrahunt</a>, <a href=https://github.com/uranusjr>@uranusjr</a>, <a href=https://github.com/pfmoore>@pfmoore</a>, <a href=https://github.com/brainwane>@brainwane</a>, <a href=https://github.com/sbidoul>@sbidoul</a>, <a href=https://github.com/xavfernandez>@xavfernandez</a>, <a href=https://github.com/webknjaz>@webknjaz</a>, <a href=https://github.com/jaraco>@jaraco</a>, <a href=https://github.com/deveshks>@deveshks</a>, <a href=https://github.com/gutsytechster>@gutsytechster</a>, <a href=https://github.com/dholth>@dholth</a>, <a href=https://github.com/dstufft>@dstufft</a>, <a href=https://github.com/cosmicexplorer>@cosmicexplorer</a> and <a href=https://github.com/ofek>@ofek</a>.  While this feels like a long shout-out list, it really isn&#39;t.  These people are the maintainers, the contributors of <code>pip</code> and/or other Python packaging projects, and more importantly, they have been more than helpful, encouraging and patient with me throughout all my activities, showing me the way when I was lost, correcting me when I was wrong, putting up with my carelessness and showing me support across different social media.</p>
<p>To best serve the community, below I have tried my best to document what I have done, how I&#39;ve done it and why I&#39;ve done it over the last three months.  At the time of writing, some work is still in progress, so these also serve as a reference point for myself and others to reason about decisions in relevant topics.</p>
<div class="franklin-toc"><ol><li>The Main Story<ol><li>Act One: Parallelization Utilities</li><li>Act Two: Lazy Wheels</li><li>Act Three: Late Downloading</li><li>Act Four: Batch Downloading in Parallel</li></ol></li><li>The Plot Summary</li></ol></div>
<h2 id="the_main_story">The Main Story</h2>
<p>The storyline can be divided into the following four main acts.</p>
<h3 id="act_one_parallelization_utilities">Act One: Parallelization Utilities</h3>
<p>In this first act, I ensured the portability of parallelization measures for later use in the final act.  Multithreading and multiprocessing <code>map</code> now properly fall back on platforms without full support.</p>
<ul>
<li><p><a href=https://github.com/pypa/pip/pull/8320>GH-8320</a>: Add utilities for parallelization &#40;close <a href=https://github.com/pypa/pip/pull/8169>GH-8169</a>&#41;</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8538>GH-8538</a>: Make <code>utils.parallel</code> tests tear down properly</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8504>GH-8504</a>: Parallelize <code>pip list --outdated</code> and <code>--uptodate</code> &#40;using <a href=https://github.com/pypa/pip/pull/8320>GH-8320</a>&#41;</p>
</li>
</ul>
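<p>The fallback idea can be sketched roughly as follows.  This is a simplified illustration in the spirit of those utilities, not the code that was merged: the names <code>HAS_SEM_OPEN</code> and <code>map_multithread</code> as well as the default pool size are assumptions made for this example.</p>
<pre><code class="language-python">from multiprocessing.dummy import Pool as ThreadPool

try:
    # This import fails on platforms lacking a working sem_open,
    # where the pools from multiprocessing cannot be used.
    import multiprocessing.synchronize  # noqa: F401
except ImportError:
    HAS_SEM_OPEN = False
else:
    HAS_SEM_OPEN = True


def map_multithread(func, iterable, chunksize=1, poolsize=5):
    """Apply func to every item, using threads where possible.

    The results are returned as a list in arbitrary order.
    On platforms without full support, fall back to a sequential map.
    """
    if not HAS_SEM_OPEN:
        return list(map(func, iterable))
    with ThreadPool(poolsize) as pool:
        # Consume the results before the pool is torn down.
        return list(pool.imap_unordered(func, iterable, chunksize))
</code></pre>
<p>Callers get the same interface on every platform and must not rely on the order of the results, e.g. <code>sorted&#40;map_multithread&#40;abs, [-3, 1, -2]&#41;&#41;</code>.</p>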
<h3 id="act_two_lazy_wheels">Act Two: Lazy Wheels</h3>
<p>As proposed by <a href=https://github.com/cosmicexplorer>@cosmicexplorer</a> in <a href=https://github.com/pypa/pip/pull/7819>GH-7819</a>, it is possible to only download a portion of a wheel to obtain metadata during dependency resolution. Not only would this reduce the total amount of data to be transmitted over the network in case the resolver needs to perform heavy backtracking, but it would also create a synchronization point at the end of the resolution process where parallel downloading can be applied to the needed wheels &#40;some wheels solely serve their metadata during dependency backtracking and are not needed by the users&#41;.</p>
<ul>
<li><p><a href=https://github.com/pypa/pip/pull/8467>GH-8467</a>: Add utility to lazily acquire wheel metadata over HTTP</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8584>GH-8584</a>: Revise lazy wheel and its tests</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8681>GH-8681</a>: Make range requests closer to chunk size &#40;help <a href=https://github.com/pypa/pip/pull/8670>GH-8670</a>&#41;</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8716>GH-8716</a> and <a href=https://github.com/pypa/pip/pull/8730>GH-8730</a>: Disable caching for range requests</p>
</li>
</ul>
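<p>To give an idea of why a partial download is enough: a wheel is a ZIP archive, whose index sits at the end of the file, so a seekable file-like object that fetches byte ranges on demand lets <code>zipfile</code> read a single member without touching the rest.  Below is a minimal sketch of that idea, not the implementation in <code>pip</code>: <code>RangeReader</code>, <code>fetch</code> and <code>demo</code> are hypothetical names, and the range requests are simulated by slicing an archive built in memory.</p>
<pre><code class="language-python">import io
import zipfile


class RangeReader(io.RawIOBase):
    """Seekable read-only file whose reads go through fetch(start, end)."""

    def __init__(self, fetch, length):
        super().__init__()
        self._fetch, self._length, self._pos = fetch, length, 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        base = {io.SEEK_SET: 0, io.SEEK_CUR: self._pos,
                io.SEEK_END: self._length}[whence]
        self._pos = max(0, base + offset)  # clamp instead of going negative
        return self._pos

    def readinto(self, buffer):
        end = min(self._pos + len(buffer), self._length)
        data = self._fetch(self._pos, end)
        buffer[:len(data)] = data
        self._pos += len(data)
        return len(data)


def demo():
    """Read METADATA from a wheel-like archive, counting the bytes requested."""
    blob = io.BytesIO()
    with zipfile.ZipFile(blob, 'w') as archive:
        archive.writestr('pkg/data.bin', bytes(100000))
        archive.writestr('pkg-1.0.dist-info/METADATA',
                         'Name: pkg\nVersion: 1.0\n')
    remote = blob.getvalue()  # stands in for the file on the server
    fetched = []

    def fetch(start, end):
        # Real code would send an HTTP request with a Range header here.
        fetched.append(end - start)
        return remote[start:end]

    with zipfile.ZipFile(RangeReader(fetch, len(remote))) as wheel:
        metadata = wheel.read('pkg-1.0.dist-info/METADATA').decode()
    return metadata, sum(fetched), len(remote)
</code></pre>
<p>In this demo, <code>demo&#40;&#41;</code> returns the metadata text together with the number of bytes requested and the archive size; the former is only a small fraction of the latter.  A real implementation additionally has to deal with servers that do not support range requests, as well as chunking and caching.</p>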
<h3 id="act_three_late_downloading">Act Three: Late Downloading</h3>
<p>During this act, the main work was refactoring to integrate the <em>lazy wheel</em> into <code>pip</code>&#39;s codebase and clear the way for download parallelization.</p>
<ul>
<li><p><a href=https://github.com/pypa/pip/pull/8411>GH-8411</a>: Refactor <code>operations.prepare.prepare_linked_requirement</code></p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8629>GH-8629</a>: Abstract away <code>AbstractDistribution</code> in higher-level resolver code</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8442>GH-8442</a>, <a href=https://github.com/pypa/pip/pull/8532>GH-8532</a> and <a href=https://github.com/pypa/pip/pull/8588>GH-8588</a> &#40;later reworked by <a href=https://github.com/chrahunt>@chrahunt</a> in <a href=https://github.com/pypa/pip/pull/8685>GH-8685</a>&#41;: Use lazy wheel to obtain dependency information for the new resolver</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8743>GH-8743</a>: Test hash checking for <code>fast-deps</code></p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8804>GH-8804</a>: Check download directory before making range requests</p>
</li>
</ul>
<h3 id="act_four_batch_downloading_in_parallel">Act Four: Batch Downloading in Parallel</h3>
<p>The final act is mostly about the UI of the parallel download. My work revolved around how the progress should be displayed and how other relevant information should be reported to the users.</p>
<ul>
<li><p><a href=https://github.com/pypa/pip/pull/8710>GH-8710</a>: Revise method fetching metadata using lazy wheels</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8722>GH-8722</a>: Dedent late download logs &#40;fix <a href=https://github.com/pypa/pip/pull/8721>GH-8721</a>&#41;</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8737>GH-8737</a>: Add a hook for batch downloading</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8771>GH-8771</a>: Parallelize wheel download</p>
</li>
</ul>
<h2>The Side Quests</h2>
<p>In order to keep the wheel turning &#40;no pun intended&#41; and avoid wasting time waiting for the pull requests above to be reviewed, I decided to create even more PRs &#40;as I am typing this, many of the patches listed below are nowhere near being merged&#41;.</p>
<ul>
<li><p><a href=https://github.com/pypa/pip/pull/7878>GH-7878</a>: Fail early when install path is not writable</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/7928>GH-7928</a>: Fix rst syntax in Getting Started guide</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/7988>GH-7988</a>: Fix tabulate col size in case of empty cell</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8137>GH-8137</a>: Add subcommand alias mechanism</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8143>GH-8143</a>: Make mypy happy with beta release automation</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8248>GH-8248</a>: Fix typo and simplify ireq call</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8332>GH-8332</a>: Add license requirement to <code>_vendor/README.rst</code></p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8423>GH-8423</a>: Nitpick logging calls</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8435>GH-8435</a>: Use str.format style in logging calls</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8456>GH-8456</a>: Lint <code>src/pip/_vendor/README.rst</code></p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8568>GH-8568</a>: Declare constants in configuration.py as such</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8571>GH-8571</a>: Clean up <code>Configuration.unset_value</code> and nit <code>__init__</code></p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8578>GH-8578</a>: Allow verbose/quiet level to be specified via config files and environment variables</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8599>GH-8599</a>: Replace tabs by spaces for consistency</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8614>GH-8614</a>: Use <code>monkeypatch.setenv</code> to mock environment variables</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8674>GH-8674</a>: Fix <code>tests/functional/test_install_check.py</code>, when run with new resolver</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8692>GH-8692</a>: Make assertion failure give better message</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8709>GH-8709</a>: List downloaded distributions before exiting &#40;fix <a href=https://github.com/pypa/pip/pull/8696>GH-8696</a>&#41;</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8759>GH-8759</a>: Allow py2 deprecation warning from setuptools</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8766>GH-8766</a>: Use the new resolver for test requirements</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8790>GH-8790</a>: Mark tests using remote svn and hg as xfail</p>
</li>
<li><p><a href=https://github.com/pypa/pip/pull/8795>GH-8795</a>: Reformat a few spots in user guide</p>
</li>
</ul>
<h2 id="the_plot_summary">The Plot Summary</h2>
<p>Every Monday throughout the Summer of Code, I summarized what I had done in the week before in the form of either a short blog or an &#40;even shorter&#41; check-in.  These write-ups often contain handfuls of popular culture references and were originally hosted on <a href="https://blogs.python-gsoc.org/en/mcsinyxs-blog">Python GSoC</a>.</p>
<ul>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/1>First Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/1>Unexpected Things When You&#39;re Expecting</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/2>Second Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/2>The Wonderful Wizard of O&#39;zip</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/3>Third Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/3>I&#39;m Not Drowning On My Own</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/4>Fourth Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/4>I&#39;ve Walked 500 Miles…</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/5>Fifth Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/5>Sorting Things Out</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/6>Sixth Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/6>Parallelizing Wheel Downloads</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/checkin/7>Final Check-In</a></p>
</li>
<li><p><a href=https://lumvok.store/blog/2020/gsoc/article/7>Outro</a></p>
</li>
</ul>    <a href="mailto:cnx.site@loa.loang.net?In-Reply-To=%3Cblog/2020/gsoc@cnx%3E&Subject=Re: Google Summer of Code 2020">Reply via email</a>]]></content:encoded>
  <comments><![CDATA[https://lists.sr.ht/~cnx/site?search=In-Reply-To:%3Cblog/2020/gsoc@cnx%3E]]></comments>
  <wfw:commentRss>https://lumvok.store/blog/2020/gsoc/comments.xml</wfw:commentRss>
</item>
</channel></rss>