Title: Build-a-blog
Date: 2024-06-17T14:46:36-04:00
---

I want to share my thought process for building a static blog generator from scratch.

The goal is to take 1 afternoon + caffeine + some DIY spirit → _something_ resembling a static site/blog generator.

Let's see how hard this will be. Here's what a blog needs to do:

* Generate an index with a list of recent posts.
* Generate each individual post written in markdown -> html
    * Support some metadata in each post
    * A post title should have a slug
* Generate RSS

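
The slug requirement is the only one in the list above that this post never turns into code, since the filename (`build_a_blog.md`) ends up doing that job. If a slug had to be derived from the `Title` instead, a minimal sketch could look like the following. The `slugify` name and the hyphen-separated output are assumptions for illustration, not something the generator uses.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a post title into a lowercase, hyphen-separated slug."""
    # Fold accented characters to their closest ASCII equivalents.
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_title.lower())
    return slug.strip("-")
```

With this sketch, `slugify("Build-a-blog")` gives `build-a-blog`, so the title and the URL would stay in sync without naming files by hand.
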
That boils down to:

1. Read some files
2. Parse markdown, maybe parse a header with some key/values.
3. Template strings

So there is 1 "exotic" feature: parsing Markdown and rendering it as HTML.

The rest is just file and string manipulation.

Most scripting languages would be fine tools for this task. But how to handle Markdown?

## Picking the tool for the job

I've had [Crystal][1] in the back of my mind for this task. It is a nice general purpose language that used to include Markdown in the stdlib! But unfortunately Markdown was removed in [0.31.0][2]. Other than that, I'm not sure any other language includes a well-rounded Markdown implementation out of the box.

I'll likely be building the site in Docker with an Alpine image, so here's a quick search of Alpine's repos to see what could be useful:

```shell
❯ docker run --rm -it alpine
/ # apk update
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/main/x86_64/APKINDEX.tar.gz
fetch https://dl-cdn.alpinelinux.org/alpine/v3.18/community/x86_64/APKINDEX.tar.gz
v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/main]
v3.18.6-263-g77db018514d [https://dl-cdn.alpinelinux.org/alpine/v3.18/community]
OK: 20079 distinct packages available
/ # apk search markdown
discount-2.2.7c-r1
discount-dev-2.2.7c-r1
discount-libs-2.2.7c-r1
kdepim-addons-23.04.3-r0
markdown-1.0.1-r3
markdown-doc-1.0.1-r3
py3-docstring-to-markdown-0.12-r1
py3-docstring-to-markdown-pyc-0.12-r1
py3-html2markdown-0.1.7-r3
py3-html2markdown-pyc-0.1.7-r3
py3-markdown-3.4.3-r1
py3-markdown-it-py-2.2.0-r1
py3-markdown-it-py-pyc-2.2.0-r1
py3-markdown-pyc-3.4.3-r1
```

[`py3-markdown` in Alpine][3] is the popular [`python-markdown`][4]. It's mature and also available as a package in my [home distro][5].

With that, we should have the exotic Markdown dependency figured out.

## Let's build

First, let's read 1 post file and render some HTML.

We'll store posts in `posts/` like `posts/build_a_blog.md`.

And we'll store the HTML output in the same directory: `posts/build_a_blog.html`.

```python
import re
import logging

import markdown

# Matches the trailing ".md" so it can be swapped for ".html".
destpath_re = re.compile(r'\.md$')

logging.basicConfig(encoding='utf-8', level=logging.INFO)

def render_post(fpath):
    destpath = destpath_re.sub('.html', fpath)
    logging.info("opening %s for parsing, dest %s", fpath, destpath)
    # from: https://python-markdown.github.io/reference/
    with open(fpath, "r", encoding="utf-8") as input_file:
        logging.info("reading %s", fpath)
        text = input_file.read()

    logging.info("parsing %s", fpath)
    out = markdown.markdown(text)

    with open(destpath, "w", encoding="utf-8", errors="xmlcharrefreplace") as output_file:
        logging.info("writing to %s", destpath)
        output_file.write(out)

if __name__ == '__main__':
    render_post('posts/build_a_blog.md')
```

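
As an aside, the `.md` to `.html` path swap does not strictly need a regex. Here is a minimal sketch of the same step using `pathlib` from the stdlib; the `dest_path` helper name is made up for illustration and is not part of the script above.

```python
from pathlib import Path


def dest_path(fpath: str) -> str:
    """Return the .html path that sits next to a .md source file."""
    # with_suffix() swaps only the final extension of the path.
    return Path(fpath).with_suffix(".html").as_posix()
```

One difference to keep in mind: `with_suffix` replaces whatever the last extension is, while the regex above only touches paths that end in `.md`. Since the script only ever feeds it `posts/*.md` files, either approach should behave the same here.
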
And if we run it:

```shell
❯ python3 ./main.py
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
INFO:root:reading posts/build_a_blog.md
INFO:root:parsing posts/build_a_blog.md
INFO:root:writing to posts/build_a_blog.html
```

Looking pretty good.

    ❯ head posts/build_a_blog.html
    <h1>Build-a-blog</h1>
    <p>I want to share my thought process for how one would go about building a static blog generator from scratch.</p>
    <ul>
    <li>Generate an index with recent list of posts.</li>
    <li>Generate each individual post written in markdown -> html<ul>
    <li>Support some metadata in each post</li>
    <li>A post title should have a slug</li>
    </ul>
    </li>
    <li>Generate RSS</li>

Now let's do this for all `.md` files in `posts/`:

```python
import glob
...

def render_posts():
    files = glob.glob('posts/*.md')
    logging.info('found post files %s', files)
    for fname in files:
        render_post(fname)

if __name__ == '__main__':
    render_posts()
```

And add another simple test post:

```shell
❯ echo '# A new post' > ./posts/a_new_post.md
❯ python3 ./main.py
INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md']
INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html
INFO:root:reading posts/a_new_post.md
INFO:root:parsing posts/a_new_post.md
INFO:root:writing to posts/a_new_post.html
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
INFO:root:reading posts/build_a_blog.md
INFO:root:parsing posts/build_a_blog.md
INFO:root:writing to posts/build_a_blog.html
❯ head ./posts/a_new_post.html
<h1>A new post</h1>
```

Basically at this point, it's a blog generator!

But I want a few more features:

* Posts listed in the index should be sorted by date.
* Each post should be templated in some HTML wrapper.

## Post ordering and templating

`python-markdown` supports metadata embedded in posts: <https://python-markdown.github.io/extensions/meta_data/>

I thought I'd need to build something here, but it turns out the extension is exactly what I need to assign a few extra attributes to a post.

We'll adjust our "spec" for posts such that each post must include the following metadata at the top of the file:

```txt
Title: Build-a-blog
Date: 2024-06-17T14:46:36-04:00
---
```

And I'd like to insert the `Title` automatically as an `<h1>` tag in each post so I don't have to write it again in the markdown.

So first, let's adjust the test blog post and test the metadata.

```shell
❯ head -n4 ./posts/build_a_blog.md
Title: Build-a-blog
Date: 2024-06-17T14:46:36-04:00
---
```

And pop open a python repl to see how this works.

```python
>>> import markdown
>>> md = markdown.Markdown(extensions = ['meta']); f = open('posts/build_a_blog.md', 'r'); txt = f.read(); out = md.convert(txt); md.Meta
{'title': ['Build-a-blog'], 'date': ['2024-06-17T14:46:36-04:00']}
```

Looks pretty nice!

So first I will adjust the rendering function to prepend a line like this:

```markdown
# {title}
```

That happens right after we read the file and extract the metadata.

```python
def render_post(fpath):
    ...

    md = markdown.Markdown(extensions = ['meta'])

    logging.info("parsing %s", fpath)
    out = md.convert(text)

    # Each meta value is a list of lines, so take the first entry.
    title = md.Meta.get('title')[0]
    date = md.Meta.get('date')[0]

    out = markdown.markdown('# ' + title) + out
```

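
One thing to be aware of: `md.Meta.get('title')[0]` fails with a fairly cryptic `TypeError` when a post has no `Title` header, because `.get()` returns `None`. A small helper could turn that into a readable error. This is only a sketch; `required_meta` is a hypothetical name, and the dict it receives is assumed to have the same shape as the `md.Meta` output shown in the repl above.

```python
def required_meta(meta, key, fpath):
    """Return the first value stored under key, or raise a clear error."""
    values = meta.get(key)
    if not values:
        # Name the file so a broken post is easy to track down.
        raise ValueError(f"{fpath}: missing required metadata '{key}'")
    return values[0]
```

Usage would be something like `title = required_meta(md.Meta, 'title', fpath)` in place of the direct indexing.
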
Finally, let's return a structure so other parts of the program know which file was rendered and what its metadata is (title, date).

```python
def render_post(fpath):
    ...
    out = markdown.markdown('# ' + title) + out

    with open(destpath, "w", encoding="utf-8", errors="xmlcharrefreplace") as output_file:
        logging.info("writing to %s", destpath)
        output_file.write(out)

    return {
        'title': title,
        'date': date,
        'fpath': fpath,
        'destpath': destpath,
    }
```

Now we have what we need to generate a complete index.

### Index templating

Let's start by defining what our index template file will be.

I'll choose `index.html.tmpl` and after rendering we will write to `index.html`.

So let's make a function that takes a list of the post structures above and renders it as a `<ul>`.

```python
import datetime
from string import Template
...
def posts_list_html(posts):
    post_tpl = """<li>
<a href="{href}">{title}</a>
<time datetime="{date}">{disp_date}</time>
</li>"""
    out = '<ul class="blog-posts-list">'
    for post in posts:
        # Show only the calendar date; keep the full timestamp in the attribute.
        disp_date = datetime.datetime.fromisoformat(post.get('date')).strftime('%Y-%m-%d')
        out += post_tpl.format(href=post.get('destpath'),
                               title=post.get('title'),
                               date=post.get('date'),
                               disp_date=disp_date)
    return out + '</ul>'

def render_index(posts):
    fname = 'index.html.tmpl'
    outname = 'index.html'

    with open(fname, 'r', encoding='utf-8') as inf:
        tmpl = Template(inf.read())

    posts_html = posts_list_html(posts)

    html = tmpl.substitute(posts=posts_html)

    with open(outname, 'w', encoding='utf-8') as outf:
        outf.write(html)
```

Make sure that `index.html.tmpl` contains a template variable for `${posts}`:

```shell
❯ grep -C2 '\${posts}' ./index.html.tmpl
        <div class="col-md-8 col-sm-12">
          <p>Welcome. Something will go here eventually.</p>
          ${posts}
        </div>
        <div class="col-md-4 col-sm-12">
```

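
It helps to know how `string.Template` reacts when the template and the code disagree: `substitute()` raises a `KeyError` for any placeholder that is not given a value. Below is a minimal sketch of that behaviour. The `page` template is a hypothetical, trimmed-down stand-in for `index.html.tmpl`, and `fill` is an illustrative helper rather than part of the generator.

```python
from string import Template

# Hypothetical, trimmed-down stand-in for index.html.tmpl.
page = Template('<div class="col-md-8">\n${posts}\n</div>')


def fill(template, **values):
    """Fill a template, reporting which placeholder was left out."""
    try:
        return template.substitute(**values)
    except KeyError as missing:
        raise ValueError(f"template placeholder {missing} has no value") from None
```

Calling `fill(page, posts='<ul></ul>')` returns the filled-in markup, while `fill(page)` raises. Also worth remembering: a literal `$` in a template file has to be written as `$$`.
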
And we now need to connect `render_posts()`, which returns each post that was processed, to `render_index()`.

```python
def render_posts():
    files = glob.glob('posts/*.md')
    logging.info('found post files %s', files)
    posts = []
    for fname in files:
        p = render_post(fname)
        posts.append(p)
        logging.info('rendered post: %s', p)

    return posts

if __name__ == '__main__':
    posts = render_posts()
    logging.info('rendered posts: %s', posts)
    render_index(posts)
```

And let's run it!

```shell
❯ python3 ./main.py
INFO:root:found post files ['posts/a_new_post.md', 'posts/build_a_blog.md']
INFO:root:opening posts/a_new_post.md for parsing, dest posts/a_new_post.html
INFO:root:reading posts/a_new_post.md
INFO:root:parsing posts/a_new_post.md
INFO:root:writing to posts/a_new_post.html
INFO:root:rendered post: {'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'}
INFO:root:opening posts/build_a_blog.md for parsing, dest posts/build_a_blog.html
INFO:root:reading posts/build_a_blog.md
INFO:root:parsing posts/build_a_blog.md
INFO:root:writing to posts/build_a_blog.html
INFO:root:rendered post: {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'}
INFO:root:rendered posts: [{'title': 'A new post', 'date': '2024-06-17T15:09:26-04:00', 'fpath': 'posts/a_new_post.md', 'destpath': 'posts/a_new_post.html'}, {'title': 'Build-a-blog', 'date': '2024-06-17T14:46:36-04:00', 'fpath': 'posts/build_a_blog.md', 'destpath': 'posts/build_a_blog.html'}]
```

And check how the output looks:

```shell
❯ grep -C4 'blog-posts-list' ./index.html
    </nav>
    <section class="container">
      <div class="row">
        <div class="col-md-8 col-sm-12">
          <ul class="blog-posts-list"><li>
<a href="posts/a_new_post.html">A new post</a>
<time datetime="2024-06-17T19:48:17-04:00">2024-06-17</time>
</li><li>
<a href="posts/build_a_blog.html">Build-a-blog</a>
```

Not bad!

### Post templating

I think I want my blog to keep the overall layout from the index page and render the post body where the main post list is.

So let's make that template rendering a bit more general.

We'll also rename the template variable for the content area from `${posts}` to `${content}`.

```python
def render_template(tpl_fname, out_fname, content_html):
    with open(tpl_fname, 'r', encoding='utf-8') as inf:
        tmpl = Template(inf.read())

    html = tmpl.substitute(content=content_html)

    with open(out_fname, 'w', encoding='utf-8') as outf:
        outf.write(html)

def render_index(posts):
    content_html = posts_list_html(posts)
    render_template('index.html.tmpl', 'index.html', content_html)
```

And now adjust where posts are written out.

```python
def render_post(fpath):
    ...
    out = markdown.markdown('# ' + title) + out
    logging.info("writing to %s", destpath)
    # The rendered post body replaces ${content} in the shared layout.
    render_template('index.html.tmpl', destpath, out)
```

After running, each `posts/*.html` file should use the full index template and contain that post's generated HTML.

### Post sorting

With everything wired up, now we just need to sort the posts list by the date metadata.

Let's do a bit of python repl sort testing because I never remember `datetime` usage.

Let's generate a few nicely formatted ISO date strings for testing.

```shell
❯ date -d'2023-01-01' -Is
2023-01-01T00:00:00-05:00
❯ date -Is
2024-06-17T16:30:35-04:00
```

And make a test array:

```python
>>> posts = [{'date': '2023-01-01T00:00:00-05:00'}, {'date': '2024-06-17T16:30:35-04:00'}]
```

With our current script, the older post would be listed first. So let's try a sort.

```python
# Double checking datetime parsing
>>> import datetime
>>> newer = datetime.datetime.fromisoformat('2024-06-17T16:30:35-04:00'); newer
datetime.datetime(2024, 6, 17, 16, 30, 35, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))
>>> older = datetime.datetime.fromisoformat('2023-01-01T00:00:00-05:00'); older
datetime.datetime(2023, 1, 1, 0, 0, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=68400)))

# Checking python sorting methods work as expected
>>> newer.__gt__(older)
True
>>> newer.__lt__(older)
False
>>> older.__gt__(newer)
False
>>> older.__lt__(newer)
True

# Doing the sort
>>> sorted(posts, key=lambda x: datetime.datetime.fromisoformat(x['date']), reverse=True)
[{'date': '2024-06-17T16:30:35-04:00'}, {'date': '2023-01-01T00:00:00-05:00'}]
```

Now let's apply this to our posts.

```python
if __name__ == '__main__':
    posts = render_posts()
    logging.info('rendered posts: %s', posts)
    # Newest first, comparing parsed datetimes rather than raw strings.
    sorted_posts = sorted(posts,
        key=lambda p: datetime.datetime.fromisoformat(p['date']), reverse=True)
    render_index(sorted_posts)
```

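
Parsing the dates before sorting matters once posts carry different UTC offsets, because a plain string sort only looks at the local wall-clock text. Here is a small sketch of the difference. The `newest_first` helper and the two sample posts are made up for illustration.

```python
from datetime import datetime


def newest_first(posts):
    """Sort post dicts by their ISO 8601 'date' value, newest first."""
    return sorted(posts,
                  key=lambda p: datetime.fromisoformat(p["date"]),
                  reverse=True)


# Hypothetical posts: the same evening, written in two different UTC offsets.
posts = [
    {"title": "east", "date": "2024-06-17T20:00:00-04:00"},  # 00:00 UTC on the 18th
    {"title": "west", "date": "2024-06-17T19:30:00-07:00"},  # 02:30 UTC on the 18th
]
```

`newest_first(posts)` puts "west" first, since timezone-aware datetimes compare by their UTC instant, whereas sorting on the raw strings would put "east" first. Note that every date needs an offset for this to work: comparing an aware datetime with a naive one raises a `TypeError`.
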
### `<title />` Templating

The last bit of templating is to make each post's `<title>` different.

I'll try something like `<title>cfebs.com - ${title}</title>`.

So in `index.html.tmpl`:

```html
<title>cfebs.com${more_title}</title>
```

For the index, `more_title` is just an empty string. Note that the last argument to `render_template` is now a dict of substitutions; the updated function is shown in the RSS section below.

```python
def render_index(posts):
    content_html = posts_list_html(posts)
    render_template('index.html.tmpl', 'index.html', {'content': content_html, 'more_title': ''})
```

But for a post:

```python
def render_post(fpath):
    ...
    title = md.Meta.get('title')[0]
    date = md.Meta.get('date')[0]

    out = markdown.markdown('# ' + title) + out

    logging.info("writing to %s", destpath)
    render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title})
```

At this point we have functioning blog post generation with templating.

## RSS

This should be pretty easy, as RSS is just our blog index list reformatted into different XML.

The `render_template` function will be useful here with a few more tweaks. So I'll make another template file (based off a reference <https://drewdevault.com/blog/index.xml>).

```shell
# Grab the reference
❯ curl -sL 'https://drewdevault.com/blog/index.xml' > index.xml.example

# After a bit of editing
❯ cat ./index.xml.tmpl
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>${site_title}</title>
    <link>${site_link}</link>
    <description>${description}</description>
    <language>en</language>
    <lastBuildDate>${last_build_date}</lastBuildDate>
    <atom:link href="${self_full_link}" rel="self" type="application/rss+xml" />
    ${items}
  </channel>
</rss>
```

`render_template` now gets even more generic and passes a `dict` to `Template.substitute()`:

```python
def render_template(tpl_fname, out_fname, subs):
    with open(tpl_fname, 'r', encoding='utf-8') as inf:
        tmpl = Template(inf.read())

    # subs maps each ${name} in the template to its replacement text.
    out = tmpl.substitute(subs)

    with open(out_fname, 'w', encoding='utf-8') as outf:
        outf.write(out)
```

And make sure to adjust any usages of `render_template` that exist. Every placeholder in the template needs a value, so the index passes an empty `more_title`.

```python
def render_index(posts):
    content_html = posts_list_html(posts)
    render_template('index.html.tmpl', 'index.html', {'content': content_html, 'more_title': ''})

def render_post(fpath):
    ...
    render_template('index.html.tmpl', destpath, {'content': out, 'more_title': ' - ' + title})
```

And now we can hack away at RSS generation:

```python
def render_rss_index(posts):
    subs = {
        'site_title': 'cfebs.com',
        'site_link': 'https://cfebs.com',
        'self_full_link': 'https://cfebs.com/index.xml',
        'description': 'Recent content from cfebs.com',
        'last_build_date': 'TODO',
        'items': 'TODO',
    }
    render_template('index.xml.tmpl', 'index.xml', subs)
```

After this initial test and a `python3 ./main.py` run, we should see the XML filled out.

```shell
❯ cat ./index.xml
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>cfebs.com</title>
    <link>https://cfebs.com</link>
    <description>Recent content from cfebs.com</description>
    <language>en</language>
    <lastBuildDate>TODO</lastBuildDate>
    <atom:link href="https://cfebs.com/index.xml" rel="self" type="application/rss+xml" />
    TODO
  </channel>
</rss>
```

Now let's finish up by generating each item entry and collecting them to be replaced in the template.

```python
import email.utils
import html
...

def rss_post_xml(post):
    tpl = """
<item>
<title>{title}</title>
<link>{link}</link>
<pubDate>{pubdate}</pubDate>
<guid>{link}</guid>
<description>{description}</description>
</item>
"""

    with open(post['fpath'], 'r', encoding='utf-8') as inf:
        text = inf.read()

    md = markdown.Markdown(extensions=['extra', 'meta'])
    converted = md.convert(text)

    link = "https://cfebs.com/" + post['destpath']
    # RSS wants RFC 2822 style dates, which email.utils can produce.
    pubdate = email.utils.format_datetime(datetime.datetime.fromisoformat(post['date']))
    subs = dict(title=post['title'], link=link,
                pubdate=pubdate,
                description=converted)

    for k,v in subs.items():
        subs[k] = html.escape(v)

    return tpl.format(**subs)

def render_rss_index(posts):
    items = ''
    for post in posts[:5]:
        items += rss_post_xml(post)

    subs = {
        'site_title': 'cfebs.com',
        'site_link': 'https://cfebs.com',
        'self_full_link': 'https://cfebs.com/index.xml',
        'description': 'Recent content from cfebs.com',
        'last_build_date': email.utils.format_datetime(datetime.datetime.now()),
    }
    for k,v in subs.items():
        subs[k] = html.escape(v)

    # items is already-escaped XML, so it is added after the escaping loop.
    subs['items'] = items
    render_template('index.xml.tmpl', 'index.xml', subs)
```

* Need to use `html.escape` anywhere we could have quotes or HTML tags in output.
* `posts[:5]` takes the 5 most recent posts for the RSS feed, as long as the list passed in is already sorted newest first.

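
The date handling above leans on `email.utils.format_datetime`, which emits the RFC 2822 style dates that RSS readers expect. A short sketch of how it behaves follows. The `rss_date` and `build_date` helper names are hypothetical, and using an explicit UTC timestamp for the build date is a suggestion, not what the script above does.

```python
import datetime
import email.utils
import html


def rss_date(iso_date):
    """Format an ISO 8601 timestamp as an RFC 2822 date for <pubDate>."""
    return email.utils.format_datetime(datetime.datetime.fromisoformat(iso_date))


def build_date():
    """Current time as an RFC 2822 date with an explicit UTC offset."""
    now = datetime.datetime.now(datetime.timezone.utc)
    return email.utils.format_datetime(now)


# Escaping turns markup into text that is safe inside an XML element.
escaped_title = html.escape('<h1>A "new" post</h1>')
```

A naive `datetime.datetime.now()` still formats fine, but its offset comes out as `-0000`, which means "UTC, source timezone unknown". Passing a timezone-aware value avoids that ambiguity.
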
## Wrapping up

Reached the end of the afternoon, so this is where I'll leave it.

It's not great software.

* No tests, no docs
* No input validation
* Hard coding values like the domain
* Using adhoc dicts for generic structures
* Relies on system python version and packages.
* Does not offer anything a tool like [hugo][hugo] does not already offer.

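
For the "adhoc dicts" item, one possible cleanup is a small dataclass. This is only a sketch of that idea: the `Post` class and its `from_dict` constructor are hypothetical, with fields mirroring the dict that `render_post` returns.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Post:
    title: str
    date: datetime
    fpath: str
    destpath: str

    @classmethod
    def from_dict(cls, d):
        """Build a Post from the ad hoc dict shape used in this article."""
        return cls(title=d["title"],
                   # Parse once here so sorting can use post.date directly.
                   date=datetime.fromisoformat(d["date"]),
                   fpath=d["fpath"],
                   destpath=d["destpath"])
```

A missing key then fails at construction time with a `KeyError`, instead of surfacing later in the index or RSS code.
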
But, it's ~150 lines of python with 1 external dependency.

If python or `python-markdown` drastically changes, it'll probably take <10 minutes to debug.

And - it was fun to write and write about.

View the complete source for generating this blog:

* [main.py](https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/main.py)
* [index.html.tmpl](https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.html.tmpl)
* [index.xml.tmpl](https://git.sr.ht/~cfebs/cfebs.srht.site/tree/main/item/index.xml.tmpl)

Or the full repo tree: <https://git.sr.ht/~cfebs/cfebs.srht.site/tree>

[1]: https://crystal-lang.org/
[2]: https://github.com/crystal-lang/crystal/releases/tag/0.31.0
[3]: https://pkgs.alpinelinux.org/package/edge/main/x86_64/py3-markdown
[4]: https://python-markdown.github.io/
[5]: https://archlinux.org/packages/extra/any/python-markdown/
[hugo]: https://gohugo.io/