From 7cdfee36fcc8a937c590d6822035e25b05481571 Mon Sep 17 00:00:00 2001 From: Collin Lefeber Date: Mon, 17 Jun 2024 23:02:33 -0400 Subject: [PATCH] build_a_blog: rm index.xml, intro --- .gitignore | 1 + index.xml | 522 ------------------------------------------ posts/build_a_blog.md | 17 +- 3 files changed, 15 insertions(+), 525 deletions(-) delete mode 100644 index.xml diff --git a/.gitignore b/.gitignore index 77c8812..b2a5535 100644 --- a/.gitignore +++ b/.gitignore @@ -1,2 +1,3 @@ index.html posts/*.html +index.xml diff --git a/index.xml b/index.xml deleted file mode 100644 index 78e2964..0000000 --- a/index.xml +++ /dev/null @@ -1,522 +0,0 @@ - - - - cfebs.com - https://cfebs.com - Recent content from cfebs.com - en - Mon, 17 Jun 2024 19:49:00 -0000 - - - - Build-a-blog - https://cfebs.com/posts/build_a_blog.html - Mon, 17 Jun 2024 14:46:36 -0400 - https://cfebs.com/posts/build_a_blog.html - <p>I want to share my thought process for how to go about building a static blog generator from scratch.</p> -<p>The goal is to take 1 afternoon + caffeine + some DIY spirit → <em>something</em> resembling a static site/blog generator.</p> -<p>Lets see how hard this will be. Here's what a blog is/requirements:</p> -<ul> -<li>Generate an index with recent list of posts.</li> -<li>Generate each individual post written in markdown -&gt; html<ul> -<li>Support some metadata in each post</li> -<li>A post title should have a slug</li> -</ul> -</li> -<li>Generate RSS</li> -</ul> -<p>That boils down to:</p> -<ol> -<li>Read some files</li> -<li>Parse markdown, maybe parse a header with some key/values.</li> -<li>Template strings</li> -</ol> -<p>So there is 1 "exotic" feature in parsing/rendering Markdown as HTML.</p> -<p>The rest is just file and string manipulation.</p> -<p>Most scripting languages would be fine tools for this task. 
But how to handle Markdown?</p> -<h2 id="picking-the-tool-for-the-job">Picking the tool for the job</h2> -<p>I've had <a href="https://crystal-lang.org/">Crystal</a> in the back of my mind for this task. It is a nice general purpose language that included Markdown in the stdlib! But unfortunately Markdown was removed in <a href="https://github.com/crystal-lang/crystal/releases/tag/0.31.0">0.31.0</a>. Other than that, I'm not sure any other languages include a well rounded Markdown implementation out of the box.</p> -<p>I'll likely be building the site in docker with an alpine image, so just a quick search in alpines repos to see what could be useful:</p> -<pre><code class="language-shell">❯ docker run --rm -it alpine -/ # apk update -fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz -fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz -v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/main] -v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/community] -OK: 20079 distinct packages available -/ # apk search markdown -discount-2.2.7c-r1 -discount-dev-2.2.7c-r1 -discount-libs-2.2.7c-r1 -kdepim-addons-23.04.3-r0 -markdown-1.0.1-r3 -markdown-doc-1.0.1-r3 -py3-docstring-to-markdown-0.12-r1 -py3-docstring-to-markdown-pyc-0.12-r1 -py3-html2markdown-0.1.7-r3 -py3-html2markdown-pyc-0.1.7-r3 -py3-markdown-3.4.3-r1 -py3-markdown-it-py-2.2.0-r1 -py3-markdown-it-py-pyc-2.2.0-r1 -py3-markdown-pyc-3.4.3-r1 -</code></pre> -<p><a href="https://pkgs.alpinelinux.org/package/edge/main/x86_64/py3-markdown"><code>py3-markdown</code> in alpine</a> is the popular <a href="https://python-markdown.github.io/"><code>python-markdown</code></a>. 
It's mature and available as a package in my <a href="https://archlinux.org/packages/extra/any/python-markdown/">home distro</a>.</p> -<p>With that, we should have the exotic Markdown dependency figured out.</p> -<h2 id="lets-build">Let's build</h2> -<p>First, lets read 1 post file and render some html.</p> -<p>We'll store posts in <code>posts/</code> like <code>posts/build_a_blog.md</code>.</p> -<p>And we'll store the HTML output in the same directory: <code>posts/build_a_blog.html</code>.</p> -<pre><code class="language-python">import re -import logging - -import markdown -destpath_re = re.compile(r'\.md$') - -logging.basicConfig(encoding='utf-8', level=logging.INFO) - -def render_post(fpath): - destpath = destpath_re.sub('.html', fpath) - logging.info(&quot;opening %s for parsing, dest %s&quot;, fpath, destpath) - # from: https://python-markdown.github.io/reference/ - with open(fpath, &quot;r&quot;, encoding=&quot;utf-8&quot;) as input_file: - logging.info(&quot;reading %s&quot;, fpath) - text = input_file.read() - - logging.info(&quot;parsing %s&quot;, fpath) - out = markdown.markdown(text) - - with open(destpath, &quot;w&quot;, encoding=&quot;utf-8&quot;, errors=&quot;xmlcharrefreplace&quot;) as output_file: - logging.info(&quot;writing to %s&quot;, destpath) - output_file.write(out) - -if __name__ == '__main__': - render_post('posts/build_a_blog.md') -</code></pre> -<p>And if we run it.</p> -<pre><code class="language-shell">❯ python3 ./main.py -INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html -INFO:root:reading posts/build_a_blog.md -INFO:root:parsing posts/build_a_blog.md -INFO:root:writing to posts/build_a_blog.html -</code></pre> -<p>Looking pretty good.</p> -<pre><code>❯ head posts/build_a_blog.html -&lt;h1&gt;Build-a-blog&lt;/h1&gt; -&lt;p&gt;I want to share my thought process for how one would go about building a static blog generator from scratch.&lt;/p&gt; -&lt;ul&gt; -&lt;li&gt;Generate an index with recent list of 
posts.&lt;/li&gt; -&lt;li&gt;Generate each individual post written in markdown -&amp;gt; html&lt;ul&gt; -&lt;li&gt;Support some metadata in each post&lt;/li&gt; -&lt;li&gt;A post title should have a slug&lt;/li&gt; -&lt;/ul&gt; -&lt;/li&gt; -&lt;li&gt;Generate RSS&lt;/li&gt; -</code></pre> -<p>Now lets do this for all <code>.md</code> files in <code>posts/</code></p> -<pre><code class="language-python">import glob -... - -def render_posts(): - files = glob.glob('posts/*.md') - logging.info('found post files %s', files) - for fname in files: - render_post(fname) - -if __name__ == '__main__': - render_posts() -</code></pre> -<p>And add another simple test post</p> -<pre><code class="language-shell">❯ echo '# A new post' &gt; ./posts/a_new_post.md -❯ python3 ./main.py -INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md'] -INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html -INFO:root:reading posts/a_new_post.md -INFO:root:parsing posts/a_new_post.md -INFO:root:writing to posts/a_new_post.html -INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html -INFO:root:reading posts/build_a_blog.md -INFO:root:parsing posts/build_a_blog.md -INFO:root:writing to posts/build_a_blog.html -❯ head ./posts/a_new_post.html -&lt;h1&gt;A new post&lt;/h1&gt; -</code></pre> -<p>Basically at this point, it's a blog generator!</p> -<p>But I want a few more features:</p> -<ul> -<li>Want the posts listed in the index sorted by date.</li> -<li>Want each post to be templated in some html wrapper.</li> -</ul> -<h2 id="post-ordering-and-templating">Post ordering and templating</h2> -<p><code>python-markdown</code> supports metadata embedded in posts: <a href="https://python-markdown.github.io/extensions/meta_data/">https://python-markdown.github.io/extensions/meta_data/</a></p> -<p>I thought I'd need to build something here, but turns out it's exactly what I need to assign a few extra attributes to a post.</p> -<p>We'll 
adjust our "spec" for posts such that each post must include the following metadata at the top of the file:</p> -<pre><code class="language-txt">Title: Build-a-blog -Date: 2024-06-17T14:46:36-04:00 ---- -</code></pre> -<p>And I'd like to insert the <code>Title</code> automatically as a <code>&lt;h1&gt;</code> tag in each post so I don't have to write it again in the markdown.</p> -<p>So first, lets test the metadata and adjust the test blog post.</p> -<pre><code class="language-shell">❯ head -n4 ./posts/build_a_blog.md -Title: Build-a-blog -Date: 2024-06-17T14:46:36-04:00 ---- -</code></pre> -<p>And pop open a python repl to see how this works.</p> -<pre><code class="language-python">&gt;&gt;&gt; md = markdown.Markdown(extensions = ['meta']); f = open('posts/build_a_blog.md', 'r'); txt = f.read(); out = md.convert(txt); md.Meta -{'title': ['Build-a-blog'], 'date': ['2024-06-17T14:46:36-04:00']} -</code></pre> -<p>Looks pretty nice!</p> -<p>So first I will adjust the rendering function to prepend a</p> -<pre><code class="language-markdown"># {title} -</code></pre> -<p>Line just after we read the file and extract the metadata.</p> -<pre><code class="language-python">def render_post(fpath): - ... - - md = markdown.Markdown(extensions = ['meta']) - - logging.info(&quot;parsing %s&quot;, fpath) - out = md.convert(text) - - title = md.Meta.get('title')[0] - date = md.Meta.get('date')[0] - - out = markdown.markdown('# ' + title) + out -</code></pre> -<p>Finally, lets return a structure that will make other parts of the program aware of the filename that was rendered and the metadata (title, date)</p> -<pre><code class="language-python">def render_post(fpath): - ... 
- out = markdown.markdown('# ' + title) + out - - with open(destpath, &quot;w&quot;, encoding=&quot;utf-8&quot;, errors=&quot;xmlcharrefreplace&quot;) as output_file: - logging.info(&quot;writing to %s&quot;, destpath) - output_file.write(out) - - return { - 'title': title, - 'date': date, - 'fpath': fpath, - 'destpath': destpath, - } -</code></pre> -<p>Now we have what we need to generate a complete index.</p> -<h3 id="index-templating">Index templating</h3> -<p>Lets start by defining what our index template file will be.</p> -<p>I'll choose <code>index.html.tmpl</code> and after rendering we will write to <code>index.html</code>.</p> -<p>So lets make a function that will take a list of our post structure above and render it in a <code>&lt;ul&gt;</code>.</p> -<pre><code>from string import Template -... -def posts_list_html(posts): - post_tpl = &quot;&quot;&quot;&lt;li&gt; - &lt;a href=&quot;{href}&quot;&gt;{title}&lt;/a&gt; - &lt;time datetime=&quot;{date}&quot;&gt;{disp_date}&lt;/time&gt; - &lt;/li&gt;&quot;&quot;&quot; - out = '&lt;ul class=&quot;blog-posts-list&quot;&gt;' - for post in posts: - disp_date = datetime.datetime.fromisoformat(post.get('date')).strftime('%Y-%m-%d') - out += post_tpl.format(href=post.get('destpath'), - title=post.get('title'), - date=post.get('date'), - disp_date=disp_date) - return out + '&lt;/ul&gt;' - -def render_index(posts): - fname = 'index.html.tmpl' - outname = 'index.html' - - with open(fname, 'r', encoding='utf-8') as inf: - tmpl = Template(inf.read()) - - posts_html = posts_html(posts) - - html = tmpl.substitute(posts=posts_html) - - with open(outname, 'w', encoding='utf-8') as outf: - outf.write(html) -</code></pre> -<p>Make sure that <code>index.html.tmpl</code> contains a template variable for <code>${posts}</code></p> -<pre><code class="language-shell">❯ grep -C2 '\${posts}' ./index.html.tmpl - &lt;div class=&quot;col-md-8 col-sm-12&quot;&gt; - &lt;p&gt;Welcome. 
Something will go here eventually.&lt;/p&gt; - ${posts} - &lt;/div&gt; - &lt;div class=&quot;col-md-4 col-sm-12&quot;&gt; -</code></pre> -<p>And we now need to connect <code>render_posts()</code> which returns each post that was processed to <code>render_index()</code></p> -<pre><code class="language-python">def render_posts(): - files = glob.glob('posts/*.md') - logging.info('found post files %s', files) - posts = [] - for fname in files: - p = render_post(fname) - posts.append(p) - logging.info('rendered post: %s', p) - - return posts - -if __name__ == '__main__': - posts = render_posts() - logging.info('rendered posts: %s', posts) - render_index(posts) -</code></pre> -<p>And lets run it!</p> -<pre><code class="language-shell">❯ python3 ./main.py -INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md'] -INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html -INFO:root:reading posts/a_new_post.md -INFO:root:parsing posts/a_new_post.md -INFO:root:writing to posts/a_new_post.html -INFO:root:rendered post: {'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'} -INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html -INFO:root:reading posts/build_a_blog.md -INFO:root:parsing posts/build_a_blog.md -INFO:root:writing to posts/build_a_blog.html -INFO:root:rendered post: {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'} -INFO:root:rendered posts: [{'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'}, {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'}] -</code></pre> -<p>And check how the output looks:</p> -<pre><code class="language-shell">❯ grep -C4 'blog-posts-list' ./index.html - 
&lt;/nav&gt; - &lt;section class=&quot;container&quot;&gt; - &lt;div class=&quot;row&quot;&gt; - &lt;div class=&quot;col-md-8 col-sm-12&quot;&gt; - &lt;ul class=&quot;blog-posts-list&quot;&gt;&lt;li&gt; - &lt;a href=&quot;posts/a_new_post.html&quot;&gt;A new post&lt;/a&gt; - &lt;time datetime=&quot;2024-06-17T19:48:17-04:00&quot;&gt;2024-06-17&lt;/time&gt; - &lt;/li&gt;&lt;li&gt; - &lt;a href=&quot;posts/build_a_blog.html&quot;&gt;Build-a-blog&lt;/a&gt; -</code></pre> -<p>Not bad!</p> -<h3 id="post-templating">Post templating</h3> -<p>I think I want my blog to just maintain the overall layout from the index page and just render the post body where the main post list is.</p> -<p>So lets make that template rendering a bit more general.</p> -<p>We'll redefine the content area template variable to replace as <code>${content}</code> too.</p> -<pre><code class="language-python">def render_template(tpl_fname, out_fname, content_html): - with open(tpl_fname, 'r', encoding='utf-8') as inf: - tmpl = Template(inf.read()) - - html = tmpl.substitute(content=content_html) - - with open(out_fname, 'w', encoding='utf-8') as outf: - outf.write(html) - -def render_index(posts): - content_html = posts_list_html(posts) - render_template('index.html.tmpl', 'index.html', content_html) - outf.write(out) -</code></pre> -<p>And now adjust where posts are written out.</p> -<pre><code class="language-python">def render_post(fpath): - ... 
- out = markdown.markdown('# ' + title) + out - logging.info(&quot;writing to %s&quot;, destpath) - render_template('index.html.tmpl', destpath, html) -</code></pre> -<p>After running you should see the each <code>post/*.html</code> file where each post file uses the full index template and includes each generated post HTML.</p> -<h3 id="post-sorting">Post sorting</h3> -<p>With everything wired up now we just need to sort the posts lists by the date metadata.</p> -<p>Lets do a bit of python repl sort testing because I never remember <code>datetime</code> usage.</p> -<p>Lets generate a few nicely formatted ISO date strings for testing.</p> -<pre><code class="language-shell">❯ date -d'2023-01-01' -Is -2023-01-01T00:00:00-05:00 -❯ date -Is -2024-06-17T16:30:35-04:00 -</code></pre> -<p>And make a test array</p> -<pre><code class="language-python">&gt;&gt;&gt; posts = [{'date': '2023-01-01T00:00:00-05:00'}, {'date': '2024-06-17T16:30:35-04:00'}] -</code></pre> -<p>With our current script, the older post would be listed first. 
So lets try a sort.</p> -<pre><code># Double checking datetime parsing -&gt;&gt;&gt; import datetime -&gt;&gt;&gt; newer = datetime.datetime.fromisoformat('2024-06-17T16:30:35-04:00') -datetime.datetime(2024, 6, 17, 16, 30, 35, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000))) -&gt;&gt;&gt; older = datetime.datetime.fromisoformat('2024-06-17T16:30:35-04:00') -datetime.datetime(2024, 6, 17, 16, 30, 35, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000))) - -# Checking python sorting methods work as expected -&gt;&gt;&gt; newer.__gt__(older) -True -&gt;&gt;&gt; newer.__lt__(older) -False -&gt;&gt;&gt; older.__gt__(newer) -False -&gt;&gt;&gt; older.__lt__(newer) -True - -# Doing the sort -&gt;&gt;&gt; sorted(posts, key=lambda x: datetime.datetime.fromisoformat(x['date']), reverse=True) -[{'date': '2024-06-17T16:30:35-04:00'}, {'date': '2023-01-01T00:00:00-05:00'}] -</code></pre> -<p>Now lets apply this to our posts.</p> -<pre><code class="language-python">if __name__ == '__main__': - posts = render_posts() - logging.info('rendered posts: %s', posts) - sorted_posts = sorted(posts, - key=lambda p: datetime.datetime.fromisoformat(p['date']), reverse=True) - render_index(sorted_posts) -</code></pre> -<h3 id="title-templating"><code>&lt;title /&gt;</code> Templating</h3> -<p>The last bit of templating is to make each post <code>&lt;title&gt;</code> different.</p> -<p>I'll try something like <code>&lt;title&gt;cfebs.com - ${title}&lt;/title&gt;</code></p> -<p>So <code>index.html.tmpl</code></p> -<pre><code class="language-html">&lt;title&gt;cfebs.com${more_title}&lt;/title&gt; -</code></pre> -<p>And where we're using the title template <code>more_title</code> will default to empty string.</p> -<pre><code class="language-python">def render_index(posts): - content_html = posts_list_html(posts) - render_template('index.html.tmpl', 'index.html', {'content': content_html, 'more_title': ''}) -</code></pre> -<p>But for a post:</p> -<pre><code 
class="language-python">def render_post(fpath): - ... - title = md.Meta.get('title')[0] - date = md.Meta.get('date')[0] - - out = markdown.markdown('# ' + title) + out - - logging.info(&quot;writing to %s&quot;, destpath) - render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title}) -</code></pre> -<p>At this point we have functioning blog post generation with templating.</p> -<h2 id="rss">RSS</h2> -<p>This should be pretty easy as RSS is just reformatting our blog index list into different XML.</p> -<p>The <code>render_template</code> function will be useful here with a few more tweaks. So I'll make another template file (based off a reference <a href="https://drewdevault.com/blog/index.xml">https://drewdevault.com/blog/index.xml</a>)</p> -<pre><code class="language-shell"># Grab the reference -❯ curl -sL 'https://drewdevault.com/blog/index.xml' &gt; index.xml.example - -# After a bit of editing -❯ cat ./index.xml.tmpl -&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot; standalone=&quot;yes&quot;?&gt; -&lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt; - &lt;channel&gt; - &lt;title&gt;${site_title}&lt;/title&gt; - &lt;link&gt;${site_link}&lt;/link&gt; - &lt;description&gt;${description}&lt;/description&gt; - &lt;language&gt;en&lt;/language&gt; - &lt;lastBuildDate&gt;${last_build_date}&lt;/lastBuildDate&gt; - &lt;atom:link href=&quot;${self_full_link}&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt; - ${items} - &lt;/channel&gt; -&lt;/rss&gt; -</code></pre> -<p><code>render_template</code> now gets even more generic and passes a <code>dict</code> to <code>Template.substitute()</code></p> -<pre><code class="language-python">def render_template(tpl_fname, out_fname, subs): - with open(tpl_fname, 'r', encoding='utf-8') as inf: - tmpl = Template(inf.read()) - - out = tmpl.substitute(subs) - - with open(out_fname, 'w', encoding='utf-8') as outf: - outf.write(out) 
-</code></pre> -<p>And make sure to adjust any usages of <code>render_template</code> that exist.</p> -<pre><code class="language-python">def render_index(posts): - content_html = posts_list_html(posts) - render_template('index.html.tmpl', 'index.html', {'content': content_html}) - -def render_post(fname): - ... - render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title}) -</code></pre> -<p>And now we can hack away at RSS generation:</p> -<pre><code>def render_rss_index(posts): - subs = { - 'site_title': 'cfebs.com', - 'site_link': 'https://cfebs.com', - 'self_full_link': 'https://cfebs.com/index.xml', - 'description': 'Recent content from cfebs.com', - 'last_build_date': 'TODO', - 'items': 'TODO', - } - render_template('index.xml.tmpl', 'index.xml', subs) -</code></pre> -<p>After this initial test and a <code>python3 ./main.py</code> run, we should see xml filled out.</p> -<pre><code>❯ cat ./index.xml -&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot; standalone=&quot;yes&quot;?&gt; -&lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt; - &lt;channel&gt; - &lt;title&gt;cfebs.com&lt;/title&gt; - &lt;link&gt;https://cfebs.com&lt;/link&gt; - &lt;description&gt;Recent content from cfebs.com&lt;/description&gt; - &lt;language&gt;en&lt;/language&gt; - &lt;lastBuildDate&gt;TODO&lt;/lastBuildDate&gt; - &lt;atom:link href=&quot;https://cfebs.com/index.xml&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt; - TODO - &lt;/channel&gt; -&lt;/rss&gt; -</code></pre> -<p>Now lets finish up by generating each item entry and collecting them to be replaced in the template.</p> -<pre><code class="language-python">def rss_post_xml(post): - tpl = &quot;&quot;&quot; - &lt;item&gt; - &lt;title&gt;{title}&lt;/title&gt; - &lt;link&gt;{link}&lt;/link&gt; - &lt;pubDate&gt;{pubdate}&lt;/pubDate&gt; - &lt;guid&gt;{link}&lt;/guid&gt; - &lt;description&gt;{description}&lt;/description&gt; - 
&lt;/item&gt; - &quot;&quot;&quot; - - with open(post['fpath'], 'r') as inf: - text = inf.read() - - md = markdown.Markdown(extensions=['extra', 'meta']) - converted = md.convert(text) - - link = &quot;https://cfebs.com/&quot; + post['destpath'] - pubdate = email.utils.format_datetime(datetime.datetime.fromisoformat(post['date'])) - subs = dict(title=post['title'], link=link, - pubdate=pubdate, - description=converted) - - for k,v in subs.items(): - subs[k] = html.escape(v) - - return tpl.format(**subs) - -def render_rss_index(posts): - items = '' - for post in posts[:5]: - items += rss_post_xml(post) - - subs = { - 'site_title': 'cfebs.com', - 'site_link': 'https://cfebs.com', - 'self_full_link': 'https://cfebs.com/index.xml', - 'description': 'Recent content from cfebs.com', - 'last_build_date': email.utils.format_datetime(datetime.datetime.now()), - } - for k,v in subs.items(): - subs[k] = html.escape(v) - - subs['items'] = items - render_template('index.xml.tmpl', 'index.xml', subs) -</code></pre> -<ul> -<li>Need to use <code>html.escape</code> anywhere we could have quotes or HTML tags in output.</li> -<li><code>posts[:5]</code> should always take the most recent 5 posts to add to the RSS feed.</li> -</ul> -<h2 id="wrapping-up">Wrapping up</h2> -<p>Reached the end of the afternoon, so this is where I'll leave it.</p> -<p>It's not great software.</p> -<ul> -<li>No tests, no docs</li> -<li>Hard coding values like the domain</li> -<li>Using adhoc dicts for generic structures</li> -<li>Relies on system python version and packages.</li> -<li>Does not offer anything a tool like <a href="https://gohugo.io/">hugo</a> does not already offer.</li> -</ul> -<p>But, it's ~150 lines of python with 1 external dependency.</p> -<p>If python or <code>python-markdown</code> drastically changes, it'll probably take 10 minutes to debug.</p> -<p>And - it was fun to write and write about.</p> -<p>View the complete source for generating this blog:</p> -<ul> -<li><a 
href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/main.py">main.py</a></li> -<li><a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.html.tmpl">index.html.tmpl</a></li> -<li><a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.xml.tmpl">index.xml.tmpl</a></li> -</ul> -<p>Or the full repo tree: <a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree">https://git.sr.ht/~cfebs/cfebs.srht.site/tree</a></p> - - - - diff --git a/posts/build_a_blog.md b/posts/build_a_blog.md index 349168c..aaf5097 100644 --- a/posts/build_a_blog.md +++ b/posts/build_a_blog.md @@ -3,9 +3,17 @@ Date: 2024-06-17T14:46:36-04:00 --- I want to share my thought process for how to go about building a static blog generator from scratch. -The goal is to take 1 afternoon + caffeine + some DIY spirit → _something_ resembling a static site/blog generator. +There will be nothing groundbreaking here - in fact, this software will not be good. So turn back now if you're expecting the new [Hugo][hugo]. -Lets see how hard this will be. Here's what a blog is/requirements: +Actually, you should probably stop reading and just use [Hugo][hugo]. + +In case you are still interested, the goal is to take 1 afternoon + caffeine + some DIY spirit → _something_ resembling a static site/blog generator. + +And I hope by the end of this post you might be inspired to build your own generation scripts, maybe in a new language you've always wanted to try. + +Let's see how hard this will be. + +Here are the requirements for this blog: * Generate an index with recent list of posts. * Generate each individual post written in markdown -> html @@ -23,10 +31,12 @@ So there is 1 "exotic" feature in parsing/rendering Markdown as HTML that will n The rest is just file and string manipulation. -Most scripting languages would be fine tools for this task. But how to handle Markdown? +Let's get it on. ## Picking the tool for the job +Most scripting languages would be fine tools for this task. 
But how to handle Markdown? + I've had [Crystal][1] in the back of my mind for this task. It is a nice general purpose language that included Markdown in the stdlib! But unfortunately Markdown was removed in [0.31.0][2]. Other than that, I'm not sure any other languages include a well rounded Markdown implementation out of the box. I'll likely end up building the site in docker with an alpine image down the road, so just a quick search in alpines repos to see what could be useful: @@ -645,3 +655,4 @@ Or the full repo tree: [4]: https://python-markdown.github.io/ [5]: https://archlinux.org/packages/extra/any/python-markdown/ [hugo]: https://gohugo.io/ +[jekyll]: https://jekyllrb.com/