build_a_blog: rm index.xml, intro
parent
3519112157
commit
7cdfee36fc
3 changed files with 15 additions and 525 deletions
1
.gitignore
vendored
|
@ -1,2 +1,3 @@
|
||||||
index.html
|
index.html
|
||||||
posts/*.html
|
posts/*.html
|
||||||
|
index.xml
|
||||||
|
|
522
index.xml
|
@ -1,522 +0,0 @@
|
||||||
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
|
|
||||||
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
|
|
||||||
<channel>
|
|
||||||
<title>cfebs.com</title>
|
|
||||||
<link>https://cfebs.com</link>
|
|
||||||
<description>Recent content from cfebs.com</description>
|
|
||||||
<language>en</language>
|
|
||||||
<lastBuildDate>Mon, 17 Jun 2024 19:49:00 -0000</lastBuildDate>
|
|
||||||
<atom:link href="https://cfebs.com/index.xml" rel="self" type="application/rss+xml" />
|
|
||||||
|
|
||||||
<item>
|
|
||||||
<title>Build-a-blog</title>
|
|
||||||
<link>https://cfebs.com/posts/build_a_blog.html</link>
|
|
||||||
<pubDate>Mon, 17 Jun 2024 14:46:36 -0400</pubDate>
|
|
||||||
<guid>https://cfebs.com/posts/build_a_blog.html</guid>
|
|
||||||
<description><p>I want to share my thought process for how to go about building a static blog generator from scratch.</p>
|
|
||||||
<p>The goal is to take 1 afternoon + caffeine + some DIY spirit → <em>something</em> resembling a static site/blog generator.</p>
|
|
||||||
<p>Let's see how hard this will be. Here are the requirements for a blog:</p>
|
|
||||||
<ul>
|
|
||||||
<li>Generate an index with recent list of posts.</li>
|
|
||||||
<li>Generate each individual post written in markdown -&gt; html<ul>
|
|
||||||
<li>Support some metadata in each post</li>
|
|
||||||
<li>A post title should have a slug</li>
|
|
||||||
</ul>
|
|
||||||
</li>
|
|
||||||
<li>Generate RSS</li>
|
|
||||||
</ul>
|
|
||||||
<p>That boils down to:</p>
|
|
||||||
<ol>
|
|
||||||
<li>Read some files</li>
|
|
||||||
<li>Parse markdown, maybe parse a header with some key/values.</li>
|
|
||||||
<li>Template strings</li>
|
|
||||||
</ol>
|
|
||||||
<p>So there is 1 "exotic" feature in parsing/rendering Markdown as HTML.</p>
|
|
||||||
<p>The rest is just file and string manipulation.</p>
|
|
||||||
<p>Most scripting languages would be fine tools for this task. But how to handle Markdown?</p>
|
|
||||||
<h2 id="picking-the-tool-for-the-job">Picking the tool for the job</h2>
|
|
||||||
<p>I've had <a href="https://crystal-lang.org/">Crystal</a> in the back of my mind for this task. It is a nice general purpose language that included Markdown in the stdlib! But unfortunately Markdown was removed in <a href="https://github.com/crystal-lang/crystal/releases/tag/0.31.0">0.31.0</a>. Other than that, I'm not sure any other languages include a well rounded Markdown implementation out of the box.</p>
|
|
||||||
<p>I'll likely be building the site in Docker with an Alpine image, so here is a quick search of Alpine's repos to see what could be useful:</p>
|
|
||||||
<pre><code class="language-shell">❯ docker run --rm -it alpine
|
|
||||||
/ # apk update
|
|
||||||
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
|
|
||||||
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
|
|
||||||
v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/main]
|
|
||||||
v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/community]
|
|
||||||
OK: 20079 distinct packages available
|
|
||||||
/ # apk search markdown
|
|
||||||
discount-2.2.7c-r1
|
|
||||||
discount-dev-2.2.7c-r1
|
|
||||||
discount-libs-2.2.7c-r1
|
|
||||||
kdepim-addons-23.04.3-r0
|
|
||||||
markdown-1.0.1-r3
|
|
||||||
markdown-doc-1.0.1-r3
|
|
||||||
py3-docstring-to-markdown-0.12-r1
|
|
||||||
py3-docstring-to-markdown-pyc-0.12-r1
|
|
||||||
py3-html2markdown-0.1.7-r3
|
|
||||||
py3-html2markdown-pyc-0.1.7-r3
|
|
||||||
py3-markdown-3.4.3-r1
|
|
||||||
py3-markdown-it-py-2.2.0-r1
|
|
||||||
py3-markdown-it-py-pyc-2.2.0-r1
|
|
||||||
py3-markdown-pyc-3.4.3-r1
|
|
||||||
</code></pre>
|
|
||||||
<p><a href="https://pkgs.alpinelinux.org/package/edge/main/x86_64/py3-markdown"><code>py3-markdown</code> in alpine</a> is the popular <a href="https://python-markdown.github.io/"><code>python-markdown</code></a>. It's mature and available as a package in my <a href="https://archlinux.org/packages/extra/any/python-markdown/">home distro</a>.</p>
|
|
||||||
<p>With that, we should have the exotic Markdown dependency figured out.</p>
|
|
||||||
<h2 id="lets-build">Let's build</h2>
|
|
||||||
<p>First, let's read one post file and render some HTML.</p>
|
|
||||||
<p>We'll store posts in <code>posts/</code> like <code>posts/build_a_blog.md</code>.</p>
|
|
||||||
<p>And we'll store the HTML output in the same directory: <code>posts/build_a_blog.html</code>.</p>
|
|
||||||
<pre><code class="language-python">import re
|
|
||||||
import logging
|
|
||||||
|
|
||||||
import markdown
|
|
||||||
destpath_re = re.compile(r'\.md$')
|
|
||||||
|
|
||||||
logging.basicConfig(encoding='utf-8', level=logging.INFO)
|
|
||||||
|
|
||||||
def render_post(fpath):
|
|
||||||
destpath = destpath_re.sub('.html', fpath)
|
|
||||||
logging.info(&quot;opening %s for parsing, dest %s&quot;, fpath, destpath)
|
|
||||||
# from: https://python-markdown.github.io/reference/
|
|
||||||
with open(fpath, &quot;r&quot;, encoding=&quot;utf-8&quot;) as input_file:
|
|
||||||
logging.info(&quot;reading %s&quot;, fpath)
|
|
||||||
text = input_file.read()
|
|
||||||
|
|
||||||
logging.info(&quot;parsing %s&quot;, fpath)
|
|
||||||
out = markdown.markdown(text)
|
|
||||||
|
|
||||||
with open(destpath, &quot;w&quot;, encoding=&quot;utf-8&quot;, errors=&quot;xmlcharrefreplace&quot;) as output_file:
|
|
||||||
logging.info(&quot;writing to %s&quot;, destpath)
|
|
||||||
output_file.write(out)
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
render_post('posts/build_a_blog.md')
|
|
||||||
</code></pre>
|
|
||||||
<p>And if we run it.</p>
|
|
||||||
<pre><code class="language-shell">❯ python3 ./main.py
|
|
||||||
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
|
|
||||||
INFO:root:reading posts/build_a_blog.md
|
|
||||||
INFO:root:parsing posts/build_a_blog.md
|
|
||||||
INFO:root:writing to posts/build_a_blog.html
|
|
||||||
</code></pre>
|
|
||||||
<p>Looking pretty good.</p>
|
|
||||||
<pre><code>❯ head posts/build_a_blog.html
|
|
||||||
&lt;h1&gt;Build-a-blog&lt;/h1&gt;
|
|
||||||
&lt;p&gt;I want to share my thought process for how one would go about building a static blog generator from scratch.&lt;/p&gt;
|
|
||||||
&lt;ul&gt;
|
|
||||||
&lt;li&gt;Generate an index with recent list of posts.&lt;/li&gt;
|
|
||||||
&lt;li&gt;Generate each individual post written in markdown -&amp;gt; html&lt;ul&gt;
|
|
||||||
&lt;li&gt;Support some metadata in each post&lt;/li&gt;
|
|
||||||
&lt;li&gt;A post title should have a slug&lt;/li&gt;
|
|
||||||
&lt;/ul&gt;
|
|
||||||
&lt;/li&gt;
|
|
||||||
&lt;li&gt;Generate RSS&lt;/li&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>Now let's do this for all <code>.md</code> files in <code>posts/</code>:</p>
|
|
||||||
<pre><code class="language-python">import glob
|
|
||||||
...
|
|
||||||
|
|
||||||
def render_posts():
|
|
||||||
files = glob.glob('posts/*.md')
|
|
||||||
logging.info('found post files %s', files)
|
|
||||||
for fname in files:
|
|
||||||
render_post(fname)
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
render_posts()
|
|
||||||
</code></pre>
|
|
||||||
<p>And add another simple test post</p>
|
|
||||||
<pre><code class="language-shell">❯ echo '# A new post' &gt; ./posts/a_new_post.md
|
|
||||||
❯ python3 ./main.py
|
|
||||||
INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md']
|
|
||||||
INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html
|
|
||||||
INFO:root:reading posts/a_new_post.md
|
|
||||||
INFO:root:parsing posts/a_new_post.md
|
|
||||||
INFO:root:writing to posts/a_new_post.html
|
|
||||||
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
|
|
||||||
INFO:root:reading posts/build_a_blog.md
|
|
||||||
INFO:root:parsing posts/build_a_blog.md
|
|
||||||
INFO:root:writing to posts/build_a_blog.html
|
|
||||||
❯ head ./posts/a_new_post.html
|
|
||||||
&lt;h1&gt;A new post&lt;/h1&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>Basically at this point, it's a blog generator!</p>
|
|
||||||
<p>But I want a few more features:</p>
|
|
||||||
<ul>
|
|
||||||
<li>Want the posts listed in the index sorted by date.</li>
|
|
||||||
<li>Want each post to be templated in some html wrapper.</li>
|
|
||||||
</ul>
|
|
||||||
<h2 id="post-ordering-and-templating">Post ordering and templating</h2>
|
|
||||||
<p><code>python-markdown</code> supports metadata embedded in posts: <a href="https://python-markdown.github.io/extensions/meta_data/">https://python-markdown.github.io/extensions/meta_data/</a></p>
|
|
||||||
<p>I thought I'd need to build something here, but turns out it's exactly what I need to assign a few extra attributes to a post.</p>
|
|
||||||
<p>We'll adjust our "spec" for posts such that each post must include the following metadata at the top of the file:</p>
|
|
||||||
<pre><code class="language-txt">Title: Build-a-blog
|
|
||||||
Date: 2024-06-17T14:46:36-04:00
|
|
||||||
---
|
|
||||||
</code></pre>
|
|
||||||
<p>And I'd like to insert the <code>Title</code> automatically as a <code>&lt;h1&gt;</code> tag in each post so I don't have to write it again in the markdown.</p>
|
|
||||||
<p>So first, let's test the metadata and adjust the test blog post.</p>
|
|
||||||
<pre><code class="language-shell">❯ head -n4 ./posts/build_a_blog.md
|
|
||||||
Title: Build-a-blog
|
|
||||||
Date: 2024-06-17T14:46:36-04:00
|
|
||||||
---
|
|
||||||
</code></pre>
|
|
||||||
<p>And pop open a python repl to see how this works.</p>
|
|
||||||
<pre><code class="language-python">&gt;&gt;&gt; md = markdown.Markdown(extensions = ['meta']); f = open('posts/build_a_blog.md', 'r'); txt = f.read(); out = md.convert(txt); md.Meta
|
|
||||||
{'title': ['Build-a-blog'], 'date': ['2024-06-17T14:46:36-04:00']}
|
|
||||||
</code></pre>
|
|
||||||
<p>Looks pretty nice!</p>
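<p>To make the shape of <code>md.Meta</code> less surprising (lowercased keys, each value a list, hence the <code>[0]</code> lookups later), here is a rough standard-library sketch of the same idea. It is not how <code>python-markdown</code> implements its <code>meta</code> extension; <code>parse_header</code> is a hypothetical helper that assumes one <code>Key: value</code> pair per line, ended by a <code>---</code> line.</p>

```python
def parse_header(text):
    """Hypothetical helper: split 'Key: value' header lines from the post body."""
    meta = {}
    lines = text.splitlines()
    body_start = len(lines)  # default: the whole file was header
    for i, line in enumerate(lines):
        if line.strip() == '---':
            body_start = i + 1  # the body begins after the separator
            break
        key, sep, value = line.partition(':')
        if not sep:
            body_start = i  # not a header line, so the body starts here
            break
        # Lowercase keys and collect values in lists, mirroring the md.Meta output above.
        meta.setdefault(key.strip().lower(), []).append(value.strip())
    return meta, '\n'.join(lines[body_start:])


sample = 'Title: Build-a-blog\nDate: 2024-06-17T14:46:36-04:00\n---\nHello *world*'
meta, body = parse_header(sample)
print(meta)
print(body)
```

<p>In the real script the extension does this work; the sketch is only meant to show why each metadata value comes back wrapped in a list.</p>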
|
|
||||||
<p>So first I will adjust the rendering function to prepend a</p>
|
|
||||||
<pre><code class="language-markdown"># {title}
|
|
||||||
</code></pre>
|
|
||||||
<p>Line just after we read the file and extract the metadata.</p>
|
|
||||||
<pre><code class="language-python">def render_post(fpath):
|
|
||||||
...
|
|
||||||
|
|
||||||
md = markdown.Markdown(extensions = ['meta'])
|
|
||||||
|
|
||||||
logging.info(&quot;parsing %s&quot;, fpath)
|
|
||||||
out = md.convert(text)
|
|
||||||
|
|
||||||
title = md.Meta.get('title')[0]
|
|
||||||
date = md.Meta.get('date')[0]
|
|
||||||
|
|
||||||
out = markdown.markdown('# ' + title) + out
|
|
||||||
</code></pre>
|
|
||||||
<p>Finally, let's return a structure that tells other parts of the program which file was rendered and what its metadata (title, date) is.</p>
|
|
||||||
<pre><code class="language-python">def render_post(fpath):
|
|
||||||
...
|
|
||||||
out = markdown.markdown('# ' + title) + out
|
|
||||||
|
|
||||||
with open(destpath, &quot;w&quot;, encoding=&quot;utf-8&quot;, errors=&quot;xmlcharrefreplace&quot;) as output_file:
|
|
||||||
logging.info(&quot;writing to %s&quot;, destpath)
|
|
||||||
output_file.write(out)
|
|
||||||
|
|
||||||
return {
|
|
||||||
'title': title,
|
|
||||||
'date': date,
|
|
||||||
'fpath': fpath,
|
|
||||||
'destpath': destpath,
|
|
||||||
}
|
|
||||||
</code></pre>
|
|
||||||
<p>Now we have what we need to generate a complete index.</p>
|
|
||||||
<h3 id="index-templating">Index templating</h3>
|
|
||||||
<p>Let's start by defining what our index template file will be.</p>
|
|
||||||
<p>I'll choose <code>index.html.tmpl</code> and after rendering we will write to <code>index.html</code>.</p>
|
|
||||||
<p>So let's make a function that takes a list of the post structures above and renders it as a <code>&lt;ul&gt;</code>.</p>
|
|
||||||
<pre><code>import datetime
from string import Template
|
|
||||||
...
|
|
||||||
def posts_list_html(posts):
|
|
||||||
post_tpl = &quot;&quot;&quot;&lt;li&gt;
|
|
||||||
&lt;a href=&quot;{href}&quot;&gt;{title}&lt;/a&gt;
|
|
||||||
&lt;time datetime=&quot;{date}&quot;&gt;{disp_date}&lt;/time&gt;
|
|
||||||
&lt;/li&gt;&quot;&quot;&quot;
|
|
||||||
out = '&lt;ul class=&quot;blog-posts-list&quot;&gt;'
|
|
||||||
for post in posts:
|
|
||||||
disp_date = datetime.datetime.fromisoformat(post.get('date')).strftime('%Y-%m-%d')
|
|
||||||
out += post_tpl.format(href=post.get('destpath'),
|
|
||||||
title=post.get('title'),
|
|
||||||
date=post.get('date'),
|
|
||||||
disp_date=disp_date)
|
|
||||||
return out + '&lt;/ul&gt;'
|
|
||||||
|
|
||||||
def render_index(posts):
|
|
||||||
fname = 'index.html.tmpl'
|
|
||||||
outname = 'index.html'
|
|
||||||
|
|
||||||
with open(fname, 'r', encoding='utf-8') as inf:
|
|
||||||
tmpl = Template(inf.read())
|
|
||||||
|
|
||||||
posts_html = posts_list_html(posts)
|
|
||||||
|
|
||||||
html = tmpl.substitute(posts=posts_html)
|
|
||||||
|
|
||||||
with open(outname, 'w', encoding='utf-8') as outf:
|
|
||||||
outf.write(html)
|
|
||||||
</code></pre>
|
|
||||||
<p>Make sure that <code>index.html.tmpl</code> contains a template variable for <code>${posts}</code></p>
|
|
||||||
<pre><code class="language-shell">❯ grep -C2 '\${posts}' ./index.html.tmpl
|
|
||||||
&lt;div class=&quot;col-md-8 col-sm-12&quot;&gt;
|
|
||||||
&lt;p&gt;Welcome. Something will go here eventually.&lt;/p&gt;
|
|
||||||
${posts}
|
|
||||||
&lt;/div&gt;
|
|
||||||
&lt;div class=&quot;col-md-4 col-sm-12&quot;&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>And we now need to connect <code>render_posts()</code> which returns each post that was processed to <code>render_index()</code></p>
|
|
||||||
<pre><code class="language-python">def render_posts():
|
|
||||||
files = glob.glob('posts/*.md')
|
|
||||||
logging.info('found post files %s', files)
|
|
||||||
posts = []
|
|
||||||
for fname in files:
|
|
||||||
p = render_post(fname)
|
|
||||||
posts.append(p)
|
|
||||||
logging.info('rendered post: %s', p)
|
|
||||||
|
|
||||||
return posts
|
|
||||||
|
|
||||||
if __name__ == '__main__':
|
|
||||||
posts = render_posts()
|
|
||||||
logging.info('rendered posts: %s', posts)
|
|
||||||
render_index(posts)
|
|
||||||
</code></pre>
|
|
||||||
<p>And let's run it!</p>
|
|
||||||
<pre><code class="language-shell">❯ python3 ./main.py
|
|
||||||
INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md']
|
|
||||||
INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html
|
|
||||||
INFO:root:reading posts/a_new_post.md
|
|
||||||
INFO:root:parsing posts/a_new_post.md
|
|
||||||
INFO:root:writing to posts/a_new_post.html
|
|
||||||
INFO:root:rendered post: {'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'}
|
|
||||||
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
|
|
||||||
INFO:root:reading posts/build_a_blog.md
|
|
||||||
INFO:root:parsing posts/build_a_blog.md
|
|
||||||
INFO:root:writing to posts/build_a_blog.html
|
|
||||||
INFO:root:rendered post: {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'}
|
|
||||||
INFO:root:rendered posts: [{'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'}, {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'}]
|
|
||||||
</code></pre>
|
|
||||||
<p>And check how the output looks:</p>
|
|
||||||
<pre><code class="language-shell">❯ grep -C4 'blog-posts-list' ./index.html
|
|
||||||
&lt;/nav&gt;
|
|
||||||
&lt;section class=&quot;container&quot;&gt;
|
|
||||||
&lt;div class=&quot;row&quot;&gt;
|
|
||||||
&lt;div class=&quot;col-md-8 col-sm-12&quot;&gt;
|
|
||||||
&lt;ul class=&quot;blog-posts-list&quot;&gt;&lt;li&gt;
|
|
||||||
&lt;a href=&quot;posts/a_new_post.html&quot;&gt;A new post&lt;/a&gt;
|
|
||||||
&lt;time datetime=&quot;2024-06-17T19:48:17-04:00&quot;&gt;2024-06-17&lt;/time&gt;
|
|
||||||
&lt;/li&gt;&lt;li&gt;
|
|
||||||
&lt;a href=&quot;posts/build_a_blog.html&quot;&gt;Build-a-blog&lt;/a&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>Not bad!</p>
|
|
||||||
<h3 id="post-templating">Post templating</h3>
|
|
||||||
<p>I think I want my blog to just maintain the overall layout from the index page and just render the post body where the main post list is.</p>
|
|
||||||
<p>So let's make that template rendering a bit more general.</p>
|
|
||||||
<p>We'll also rename the content area template variable to <code>${content}</code>.</p>
|
|
||||||
<pre><code class="language-python">def render_template(tpl_fname, out_fname, content_html):
|
|
||||||
with open(tpl_fname, 'r', encoding='utf-8') as inf:
|
|
||||||
tmpl = Template(inf.read())
|
|
||||||
|
|
||||||
html = tmpl.substitute(content=content_html)
|
|
||||||
|
|
||||||
with open(out_fname, 'w', encoding='utf-8') as outf:
|
|
||||||
outf.write(html)
|
|
||||||
|
|
||||||
def render_index(posts):
|
|
||||||
content_html = posts_list_html(posts)
|
|
||||||
render_template('index.html.tmpl', 'index.html', content_html)
|
|
||||||
|
|
||||||
</code></pre>
|
|
||||||
<p>And now adjust where posts are written out.</p>
|
|
||||||
<pre><code class="language-python">def render_post(fpath):
|
|
||||||
...
|
|
||||||
out = markdown.markdown('# ' + title) + out
|
|
||||||
logging.info(&quot;writing to %s&quot;, destpath)
|
|
||||||
render_template('index.html.tmpl', destpath, out)
|
|
||||||
</code></pre>
|
|
||||||
<p>After running, each <code>posts/*.html</code> file should use the full index template, with the generated post HTML in the content area.</p>
|
|
||||||
<h3 id="post-sorting">Post sorting</h3>
|
|
||||||
<p>With everything wired up, we just need to sort the post list by the date metadata.</p>
|
|
||||||
<p>Let's do a bit of Python REPL sort testing because I never remember <code>datetime</code> usage.</p>
|
|
||||||
<p>Let's generate a few nicely formatted ISO date strings for testing.</p>
|
|
||||||
<pre><code class="language-shell">❯ date -d'2023-01-01' -Is
|
|
||||||
2023-01-01T00:00:00-05:00
|
|
||||||
❯ date -Is
|
|
||||||
2024-06-17T16:30:35-04:00
|
|
||||||
</code></pre>
|
|
||||||
<p>And make a test array</p>
|
|
||||||
<pre><code class="language-python">&gt;&gt;&gt; posts = [{'date': '2023-01-01T00:00:00-05:00'}, {'date': '2024-06-17T16:30:35-04:00'}]
|
|
||||||
</code></pre>
|
|
||||||
<p>With our current script, the older post would be listed first. So lets try a sort.</p>
|
|
||||||
<pre><code># Double checking datetime parsing
|
|
||||||
&gt;&gt;&gt; import datetime
|
|
||||||
&gt;&gt;&gt; newer = datetime.datetime.fromisoformat('2024-06-17T16:30:35-04:00')
|
|
||||||
datetime.datetime(2024, 6, 17, 16, 30, 35, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))
|
|
||||||
&gt;&gt;&gt; older = datetime.datetime.fromisoformat('2023-01-01T00:00:00-05:00')
|
|
||||||
datetime.datetime(2023, 1, 1, 0, 0, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400)))
|
|
||||||
|
|
||||||
# Checking python sorting methods work as expected
|
|
||||||
&gt;&gt;&gt; newer.__gt__(older)
|
|
||||||
True
|
|
||||||
&gt;&gt;&gt; newer.__lt__(older)
|
|
||||||
False
|
|
||||||
&gt;&gt;&gt; older.__gt__(newer)
|
|
||||||
False
|
|
||||||
&gt;&gt;&gt; older.__lt__(newer)
|
|
||||||
True
|
|
||||||
|
|
||||||
# Doing the sort
|
|
||||||
&gt;&gt;&gt; sorted(posts, key=lambda x: datetime.datetime.fromisoformat(x['date']), reverse=True)
|
|
||||||
[{'date': '2024-06-17T16:30:35-04:00'}, {'date': '2023-01-01T00:00:00-05:00'}]
|
|
||||||
</code></pre>
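<p>One reason to parse the dates rather than sort the raw strings: ISO strings with different UTC offsets do not always sort chronologically as text. A small sketch, with made-up post titles and dates (<code>post_datetime</code> is a hypothetical helper, not part of the script above):</p>

```python
import datetime

# Made-up posts: the first was written later in absolute time,
# but its local calendar date is one day earlier.
posts = [
    {'title': 'late evening, New York', 'date': '2024-06-17T23:30:00-04:00'},
    {'title': 'early morning, UTC', 'date': '2024-06-18T01:00:00+00:00'},
]


def post_datetime(post):
    # fromisoformat keeps the UTC offset, so comparisons are timezone aware.
    return datetime.datetime.fromisoformat(post['date'])


by_string = sorted(posts, key=lambda p: p['date'], reverse=True)
by_datetime = sorted(posts, key=post_datetime, reverse=True)

print('newest by string:  ', by_string[0]['title'])
print('newest by datetime:', by_datetime[0]['title'])
```

<p>If every post used the same offset the two orderings would agree, so this only matters once dates cross timezones or daylight saving changes.</p>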
|
|
||||||
<p>Now let's apply this to our posts.</p>
|
|
||||||
<pre><code class="language-python">if __name__ == '__main__':
|
|
||||||
posts = render_posts()
|
|
||||||
logging.info('rendered posts: %s', posts)
|
|
||||||
sorted_posts = sorted(posts,
|
|
||||||
key=lambda p: datetime.datetime.fromisoformat(p['date']), reverse=True)
|
|
||||||
render_index(sorted_posts)
|
|
||||||
</code></pre>
|
|
||||||
<h3 id="title-templating"><code>&lt;title /&gt;</code> Templating</h3>
|
|
||||||
<p>The last bit of templating is to make each post <code>&lt;title&gt;</code> different.</p>
|
|
||||||
<p>I'll try something like <code>&lt;title&gt;cfebs.com - ${title}&lt;/title&gt;</code></p>
|
|
||||||
<p>So <code>index.html.tmpl</code></p>
|
|
||||||
<pre><code class="language-html">&lt;title&gt;cfebs.com${more_title}&lt;/title&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>And where we're using the title template <code>more_title</code> will default to empty string.</p>
|
|
||||||
<pre><code class="language-python">def render_index(posts):
|
|
||||||
content_html = posts_list_html(posts)
|
|
||||||
render_template('index.html.tmpl', 'index.html', {'content': content_html, 'more_title': ''})
|
|
||||||
</code></pre>
|
|
||||||
<p>But for a post:</p>
|
|
||||||
<pre><code class="language-python">def render_post(fpath):
|
|
||||||
...
|
|
||||||
title = md.Meta.get('title')[0]
|
|
||||||
date = md.Meta.get('date')[0]
|
|
||||||
|
|
||||||
out = markdown.markdown('# ' + title) + out
|
|
||||||
|
|
||||||
logging.info(&quot;writing to %s&quot;, destpath)
|
|
||||||
render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title})
|
|
||||||
</code></pre>
|
|
||||||
<p>At this point we have functioning blog post generation with templating.</p>
|
|
||||||
<h2 id="rss">RSS</h2>
|
|
||||||
<p>This should be pretty easy as RSS is just reformatting our blog index list into different XML.</p>
|
|
||||||
<p>The <code>render_template</code> function will be useful here with a few more tweaks. So I'll make another template file (based off a reference <a href="https://drewdevault.com/blog/index.xml">https://drewdevault.com/blog/index.xml</a>)</p>
|
|
||||||
<pre><code class="language-shell"># Grab the reference
|
|
||||||
❯ curl -sL 'https://drewdevault.com/blog/index.xml' &gt; index.xml.example
|
|
||||||
|
|
||||||
# After a bit of editing
|
|
||||||
❯ cat ./index.xml.tmpl
|
|
||||||
&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot; standalone=&quot;yes&quot;?&gt;
|
|
||||||
&lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt;
|
|
||||||
&lt;channel&gt;
|
|
||||||
&lt;title&gt;${site_title}&lt;/title&gt;
|
|
||||||
&lt;link&gt;${site_link}&lt;/link&gt;
|
|
||||||
&lt;description&gt;${description}&lt;/description&gt;
|
|
||||||
&lt;language&gt;en&lt;/language&gt;
|
|
||||||
&lt;lastBuildDate&gt;${last_build_date}&lt;/lastBuildDate&gt;
|
|
||||||
&lt;atom:link href=&quot;${self_full_link}&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt;
|
|
||||||
${items}
|
|
||||||
&lt;/channel&gt;
|
|
||||||
&lt;/rss&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p><code>render_template</code> now gets even more generic and passes a <code>dict</code> to <code>Template.substitute()</code></p>
|
|
||||||
<pre><code class="language-python">def render_template(tpl_fname, out_fname, subs):
|
|
||||||
with open(tpl_fname, 'r', encoding='utf-8') as inf:
|
|
||||||
tmpl = Template(inf.read())
|
|
||||||
|
|
||||||
out = tmpl.substitute(subs)
|
|
||||||
|
|
||||||
with open(out_fname, 'w', encoding='utf-8') as outf:
|
|
||||||
outf.write(out)
|
|
||||||
</code></pre>
|
|
||||||
<p>And make sure to adjust any usages of <code>render_template</code> that exist.</p>
|
|
||||||
<pre><code class="language-python">def render_index(posts):
|
|
||||||
content_html = posts_list_html(posts)
|
|
||||||
render_template('index.html.tmpl', 'index.html', {'content': content_html, 'more_title': ''})
|
|
||||||
|
|
||||||
def render_post(fpath):
|
|
||||||
...
|
|
||||||
render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title})
|
|
||||||
</code></pre>
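<p>A note on <code>Template.substitute()</code>: it raises <code>KeyError</code> when the template names a placeholder that the dict does not supply, so every caller has to pass every key the template uses. A small sketch with a made-up inline template (the real one lives in <code>index.html.tmpl</code>):</p>

```python
from string import Template

# Made-up stand-in for the template file; it uses two placeholders.
tmpl = Template('cfebs.com${more_title} | ${content}')

# All placeholders supplied: substitution succeeds.
full = tmpl.substitute({'content': 'posts', 'more_title': ''})

# One placeholder missing: substitute() raises KeyError naming it.
try:
    tmpl.substitute({'content': 'posts'})
    missing = None
except KeyError as exc:
    missing = exc.args[0]

# safe_substitute() leaves unknown placeholders in the output untouched.
lenient = tmpl.safe_substitute({'content': 'posts'})

print(full)
print(missing)
print(lenient)
```

<p>Sticking with <code>substitute()</code> seems like the better fit for a generator like this, since a forgotten key fails the build instead of shipping a literal <code>${more_title}</code> in a page.</p>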
|
|
||||||
<p>And now we can hack away at RSS generation:</p>
|
|
||||||
<pre><code>def render_rss_index(posts):
|
|
||||||
subs = {
|
|
||||||
'site_title': 'cfebs.com',
|
|
||||||
'site_link': 'https://cfebs.com',
|
|
||||||
'self_full_link': 'https://cfebs.com/index.xml',
|
|
||||||
'description': 'Recent content from cfebs.com',
|
|
||||||
'last_build_date': 'TODO',
|
|
||||||
'items': 'TODO',
|
|
||||||
}
|
|
||||||
render_template('index.xml.tmpl', 'index.xml', subs)
|
|
||||||
</code></pre>
|
|
||||||
<p>After this initial test and a <code>python3 ./main.py</code> run, we should see xml filled out.</p>
|
|
||||||
<pre><code>❯ cat ./index.xml
|
|
||||||
&lt;?xml version=&quot;1.0&quot; encoding=&quot;utf-8&quot; standalone=&quot;yes&quot;?&gt;
|
|
||||||
&lt;rss version=&quot;2.0&quot; xmlns:atom=&quot;http://www.w3.org/2005/Atom&quot;&gt;
|
|
||||||
&lt;channel&gt;
|
|
||||||
&lt;title&gt;cfebs.com&lt;/title&gt;
|
|
||||||
&lt;link&gt;https://cfebs.com&lt;/link&gt;
|
|
||||||
&lt;description&gt;Recent content from cfebs.com&lt;/description&gt;
|
|
||||||
&lt;language&gt;en&lt;/language&gt;
|
|
||||||
&lt;lastBuildDate&gt;TODO&lt;/lastBuildDate&gt;
|
|
||||||
&lt;atom:link href=&quot;https://cfebs.com/index.xml&quot; rel=&quot;self&quot; type=&quot;application/rss+xml&quot; /&gt;
|
|
||||||
TODO
|
|
||||||
&lt;/channel&gt;
|
|
||||||
&lt;/rss&gt;
|
|
||||||
</code></pre>
|
|
||||||
<p>Now let's finish up by generating each item entry and collecting them to be replaced in the template.</p>
|
|
||||||
<pre><code class="language-python">import datetime
import email.utils
import html
...

def rss_post_xml(post):
|
|
||||||
tpl = &quot;&quot;&quot;
|
|
||||||
&lt;item&gt;
|
|
||||||
&lt;title&gt;{title}&lt;/title&gt;
|
|
||||||
&lt;link&gt;{link}&lt;/link&gt;
|
|
||||||
&lt;pubDate&gt;{pubdate}&lt;/pubDate&gt;
|
|
||||||
&lt;guid&gt;{link}&lt;/guid&gt;
|
|
||||||
&lt;description&gt;{description}&lt;/description&gt;
|
|
||||||
&lt;/item&gt;
|
|
||||||
&quot;&quot;&quot;
|
|
||||||
|
|
||||||
with open(post['fpath'], 'r', encoding='utf-8') as inf:
|
|
||||||
text = inf.read()
|
|
||||||
|
|
||||||
md = markdown.Markdown(extensions=['extra', 'meta'])
|
|
||||||
converted = md.convert(text)
|
|
||||||
|
|
||||||
link = &quot;https://cfebs.com/&quot; + post['destpath']
|
|
||||||
pubdate = email.utils.format_datetime(datetime.datetime.fromisoformat(post['date']))
|
|
||||||
subs = dict(title=post['title'], link=link,
|
|
||||||
pubdate=pubdate,
|
|
||||||
description=converted)
|
|
||||||
|
|
||||||
for k,v in subs.items():
|
|
||||||
subs[k] = html.escape(v)
|
|
||||||
|
|
||||||
return tpl.format(**subs)
|
|
||||||
|
|
||||||
def render_rss_index(posts):
|
|
||||||
items = ''
|
|
||||||
for post in posts[:5]:
|
|
||||||
items += rss_post_xml(post)
|
|
||||||
|
|
||||||
subs = {
|
|
||||||
'site_title': 'cfebs.com',
|
|
||||||
'site_link': 'https://cfebs.com',
|
|
||||||
'self_full_link': 'https://cfebs.com/index.xml',
|
|
||||||
'description': 'Recent content from cfebs.com',
|
|
||||||
'last_build_date': email.utils.format_datetime(datetime.datetime.now()),
|
|
||||||
}
|
|
||||||
for k,v in subs.items():
|
|
||||||
subs[k] = html.escape(v)
|
|
||||||
|
|
||||||
subs['items'] = items
|
|
||||||
render_template('index.xml.tmpl', 'index.xml', subs)
|
|
||||||
</code></pre>
|
|
||||||
<ul>
|
|
||||||
<li>Need to use <code>html.escape</code> anywhere we could have quotes or HTML tags in output.</li>
|
|
||||||
<li><code>posts[:5]</code> should always take the most recent 5 posts to add to the RSS feed.</li>
|
|
||||||
</ul>
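<p>The date formatting, escaping, and slicing used above can be checked in isolation. This is a sketch using only the standard library, with a made-up title and post list. It also suggests why the <code>lastBuildDate</code> in the generated feed ends in <code>-0000</code>: <code>datetime.datetime.now()</code> returns a naive datetime.</p>

```python
import datetime
import email.utils
import html

# RSS wants RFC 822 style dates; fromisoformat keeps the post's UTC offset.
post_date = datetime.datetime.fromisoformat('2024-06-17T14:46:36-04:00')
pubdate = email.utils.format_datetime(post_date)

# A naive datetime (no tzinfo) is formatted with a -0000 zone.
naive = email.utils.format_datetime(datetime.datetime(2024, 6, 17, 19, 49, 0))

# An aware UTC datetime is formatted with +0000 instead.
aware = email.utils.format_datetime(
    datetime.datetime(2024, 6, 17, 19, 49, 0, tzinfo=datetime.timezone.utc))

# Escaping turns markup and quotes into entities so they are safe inside XML text.
escaped = html.escape('Tom & Jerry say "hi" <b>loudly</b>')

# Hypothetical posts, already sorted newest first; the slice keeps the newest five.
posts = [{'title': 'post %d' % n} for n in range(7, 0, -1)]
recent = [p['title'] for p in posts[:5]]

print(pubdate)
print(naive)
print(aware)
print(escaped)
print(recent)
```

<p>Note that <code>posts[:5]</code> only picks the most recent posts if the list passed in is already sorted newest first.</p>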
|
|
||||||
<h2 id="wrapping-up">Wrapping up</h2>
|
|
||||||
<p>Reached the end of the afternoon, so this is where I'll leave it.</p>
|
|
||||||
<p>It's not great software.</p>
|
|
||||||
<ul>
|
|
||||||
<li>No tests, no docs</li>
|
|
||||||
<li>Hard coding values like the domain</li>
|
|
||||||
<li>Using ad hoc dicts for generic structures</li>
|
|
||||||
<li>Relies on system python version and packages.</li>
|
|
||||||
<li>Does not offer anything a tool like <a href="https://gohugo.io/">hugo</a> does not already offer.</li>
|
|
||||||
</ul>
|
|
||||||
<p>But, it's ~150 lines of python with 1 external dependency.</p>
|
|
||||||
<p>If python or <code>python-markdown</code> drastically changes, it'll probably take 10 minutes to debug.</p>
|
|
||||||
<p>And - it was fun to write and write about.</p>
|
|
||||||
<p>View the complete source for generating this blog:</p>
|
|
||||||
<ul>
|
|
||||||
<li><a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/main.py">main.py</a></li>
|
|
||||||
<li><a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.html.tmpl">index.html.tmpl</a></li>
|
|
||||||
<li><a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.xml.tmpl">index.xml.tmpl</a></li>
|
|
||||||
</ul>
|
|
||||||
<p>Or the full repo tree: <a href="https://git.sr.ht/~cfebs/cfebs.srht.site/tree">https://git.sr.ht/~cfebs/cfebs.srht.site/tree</a></p></description>
|
|
||||||
</item>
|
|
||||||
|
|
||||||
</channel>
|
|
||||||
</rss>
|
|
|
@ -3,9 +3,17 @@ Date: 2024-06-17T14:46:36-04:00
|
||||||
---
|
---
|
||||||
I want to share my thought process for how to go about building a static blog generator from scratch.
|
I want to share my thought process for how to go about building a static blog generator from scratch.
|
||||||
|
|
||||||
The goal is to take 1 afternoon + caffeine + some DIY spirit → _something_ resembling a static site/blog generator.
|
There will be nothing ground breaking here - in fact this software will not be good. So turn back now if you're expecting the new [Hugo][hugo].
|
||||||
|
|
||||||
Lets see how hard this will be. Here's what a blog is/requirements:
|
Actually you should probably stop reading and just use [Hugo][Hugo].
|
||||||
|
|
||||||
|
In case you are still interested, the goal is to take 1 afternoon + caffeine + some DIY spirit → _something_ resembling a static site/blog generator.
|
||||||
|
|
||||||
|
And I hope by the end of this post you might be inspired to build your own generation scripts, maybe in a new language you always wanted to try.
|
||||||
|
|
||||||
|
Lets see how hard this will be.
|
||||||
|
|
||||||
|
Here are the requirements for this blog:
|
||||||
|
|
||||||
* Generate an index with recent list of posts.
|
* Generate an index with recent list of posts.
|
||||||
* Generate each individual post written in markdown -> html
|
* Generate each individual post written in markdown -> html
|
||||||
|
@ -23,10 +31,12 @@ So there is 1 "exotic" feature in parsing/rendering Markdown as HTML that will n
|
||||||
|
|
||||||
The rest is just file and string manipulation.
|
The rest is just file and string manipulation.
|
||||||
|
|
||||||
Most scripting languages would be fine tools for this task. But how to handle Markdown?
|
Lets get it on.
|
||||||
|
|
||||||
## Picking the tool for the job
|
## Picking the tool for the job
|
||||||
|
|
||||||
|
Most scripting languages would be fine tools for this task. But how to handle Markdown?
|
||||||
|
|
||||||
I've had [Crystal][1] in the back of my mind for this task. It is a nice general purpose language that included Markdown in the stdlib! But unfortunately Markdown was removed in [0.31.0][2]. Other than that, I'm not sure any other languages include a well rounded Markdown implementation out of the box.
|
I've had [Crystal][1] in the back of my mind for this task. It is a nice general purpose language that included Markdown in the stdlib! But unfortunately Markdown was removed in [0.31.0][2]. Other than that, I'm not sure any other languages include a well rounded Markdown implementation out of the box.
|
||||||
|
|
||||||
I'll likely end up building the site in docker with an alpine image down the road, so just a quick search in alpines repos to see what could be useful:
|
I'll likely end up building the site in docker with an alpine image down the road, so just a quick search in alpines repos to see what could be useful:
|
||||||
|
@ -645,3 +655,4 @@ Or the full repo tree: <https://git.sr.ht/~cfebs/cfebs.srht.site/tree>
|
||||||
[4]: https://python-markdown.github.io/
|
[4]: https://python-markdown.github.io/
|
||||||
[5]: https://archlinux.org/packages/extra/any/python-markdown/
|
[5]: https://archlinux.org/packages/extra/any/python-markdown/
|
||||||
[hugo]: https://gohugo.io/
|
[hugo]: https://gohugo.io/
|
||||||
|
[jekyll]: https://jekyllrb.com/
|
||||||
|
|