
coryd.dev


I've talked about building my own music scrobbler, I've talked about improving it and I've complained about wanting to stream my own music and then I wrote a retrospective about it. I've settled on something that works, so let's look at it in a bit more detail.

Table of contents

  1. Why build this?
  2. Data collection
  3. Displaying the data
  4. The rest of the data
  5. Conclusion

Why build this?

I can, so I must.[1]

I get to own the data: it sits there, pristine in a table over at Supabase. I can add properties, run migrations, count the rows by hand — all the things you'd normally do with raw data.

My metadata remains exactly as I've defined it. It lines up from Plex directly to what's displayed on my site. Genres, artist images — a little tedious at first, but I tend to be pretty neurotic and find the consistency satisfying.

Routing my listening through Plex means I get the benefits of streaming music without the restrictions. Apple's not deduplicating my tracks and making a mess of the genres. The album I was listening to yesterday isn't gone today because a licensing deal changed. I'm listening to You'd Prefer an Astronaut not You'd Prefer an Astronaut (Deluxe) (Remastered + Bonus tracks - 2022).

I own all the music I listen to. I pay artists for it (directly — whenever I can). It's on a hard drive that's backed up to B2 and GCP. The music directory is linked to Google Drive too.

Last.fm does exist. I've got nothing but fond memories and I still reference its recommendations sometimes. It feels abandoned (or — at the very least — neglected). I'm also wary of the fact that it gets dragged around with Paramount as they continue to change hands (are they going to unplug that server rack? Use it for AI training data? Do the latter then the former? Who knows).

ListenBrainz exists too. It's quite nice and it's operated by the fine folks at MusicBrainz. They're wonderful and I send my listen data their way too.

I own the files, I own the data and I control the experience. On we go.

Data collection

When I started out with this I was collecting listening data into giant JSON blobs stored in Netlify's blob storage. I'm a front end developer. I like JSON. This was a terrible idea. I mean, it worked but it's one of those things you force to work and then realize you've made a terrible mistake.

A good friend had been recommending Supabase. I didn't think I had a use for it. Turns out I was wrong.

I'd been mirroring listens to ListenBrainz, so I dumped my history from there to JSON, wrote a node script I'm not proud of and imported it into a proper database. Phew. I was nowhere near done (I was near the entrance of the rabbit hole at that point).
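
The script itself isn't worth sharing, but the shape of the job is simple. Here's a sketch (not the real thing): the ListenBrainz field names follow their JSON export format, and the row shape mirrors the listens table you'll see the worker insert into below.

```javascript
/*
  Sketch only, not the actual import script: map one listen from a
  ListenBrainz JSON export to a row shaped like my listens table.
*/
const toListenRow = (lbListen) => ({
  artist_name: lbListen.track_metadata.artist_name,
  album_name: lbListen.track_metadata.release_name,
  track_name: lbListen.track_metadata.track_name,
  listened_at: lbListen.listened_at, // already a unix timestamp in the export
})

// A single exported listen, trimmed down for illustration
const exported = [
  {
    listened_at: 1700000000,
    track_metadata: {
      artist_name: 'Hum',
      release_name: "You'd Prefer an Astronaut",
      track_name: 'Stars',
    },
  },
]

const rows = exported.map(toListenRow)
```

From there it's just batched inserts into Supabase.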

Again — me, front end developer — I got to learn about foreign key relationships and all sorts of database mechanics. It's funny how trying to build something you're passionate about makes learning fun. It's also worth noting that working with (making mistakes with) and configuring Directus on top of Supabase was a helpful experience. Seeing how the former configured the latter provided quite a bit of insight I'm still grateful for.
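
If foreign keys are new to you too, the mental model that stuck for me fits in plain JavaScript. These rows are hypothetical, but the relationship is the real one: the albums table's artist column holds an id from the artists table.

```javascript
// Hypothetical rows illustrating the foreign key relationship: the
// albums.artist column stores the id of a row in artists.
const artists = [{ id: 1, name_string: 'Hum' }]
const albums = [{ id: 10, name: "You'd Prefer an Astronaut", artist: 1 }]

// A join, in spirit: resolve each album's artist id to the artist row
const albumsWithArtists = albums.map((album) => ({
  ...album,
  artistName: artists.find((a) => a.id === album.artist)?.name_string,
}))
```

The database does this resolution for you (and enforces that the id actually exists), which is the whole point.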

Databases are cool. Use the right tool for the right job and all that. Just because you're good with a hammer doesn't mean it should replace a table saw. Whatever.

Alright, we're storing data. Plex makes sending it convenient because Plex offers webhooks and they even tag the event type. Webhooks get sent off to a Cloudflare Worker, and the worker watches for media.scrobble events (see — super handy). When I get a media.scrobble event, I insert a listen row into my listens table. Listens are connected to the albums table using a key that consists of artist-name-album-name. They're connected to the artists table by explicitly matching the artist name. Fragile? Maybe, but that's the metadata available in the webhook payload.

Let's step through the worker (I've added handy dandy comments to the code — or "Cory I don't care about code, skip this part"):

import { createClient } from '@supabase/supabase-js'
import { DateTime } from 'luxon'
import slugify from 'slugify'

/*
	We use the native normalize method to strip diacritics and other characters that make constructing keys difficult. We replace problematic characters with some poorly written regexes and then slugify the string (with some more sanitization — redundant? Perhaps).
*/
const sanitizeMediaString = (str) => {
  const sanitizedString = str
    .normalize('NFD')
    .replace(/[\u0300-\u036f\u2010\-\.\?\(\)\[\]\{\}]/g, '')
    .replace(/\.{3}/g, '')
  return slugify(sanitizedString, {
    replacement: '-',
    remove: /[#,&,+()$~%.'":*?<>{}]/g,
    lower: true,
  })
}

/*
	I route email notifications through forwardemail.net. My email addresses are named creatively.
*/
const sendEmail = async (subject, text, authHeader, maxRetries = 3) => {
  const emailData = new URLSearchParams({
    from: 'hi@admin.coryd.dev',
    to: 'hi@coryd.dev',
    subject: subject,
    text: text,
  }).toString()

  let attempt = 0
  let success = false

  while (attempt < maxRetries && !success) {
    attempt++
    try {
      const response = await fetch('https://api.forwardemail.net/v1/emails', {
        method: 'POST',
        headers: {
          'Content-Type': 'application/x-www-form-urlencoded',
          'Authorization': authHeader,
        },
        body: emailData,
      })

      if (!response.ok) {
        const responseText = await response.text()
        console.error(`Attempt ${attempt}: Email API response error:`, response.status, responseText)
        throw new Error(`Failed to send email: ${responseText}`)
      }

      console.log('Email sent successfully on attempt', attempt)
      success = true
    } catch (error) {
      console.error(`Attempt ${attempt}: Error sending email:`, error.message)

      if (attempt < maxRetries) {
        console.log(`Retrying email send (attempt ${attempt + 1}/${maxRetries})...`)
      } else {
        console.error('All attempts to send email failed.')
      }
    }
  }

  return success
}

export default {
  async fetch(request, env) {
    const SUPABASE_URL = env.SUPABASE_URL
    const SUPABASE_KEY = env.SUPABASE_KEY
    const FORWARDEMAIL_API_KEY = env.FORWARDEMAIL_API_KEY
    const ACCOUNT_ID_PLEX = env.ACCOUNT_ID_PLEX
    const supabase = createClient(SUPABASE_URL, SUPABASE_KEY)
    const authHeader = 'Basic ' + btoa(`${FORWARDEMAIL_API_KEY}:`)
    const url = new URL(request.url)
    const params = url.searchParams
    const id = params.get('id')

    if (!id) return new Response(JSON.stringify({ status: 'Bad request' }), {
      status: 400,
      headers: { 'Content-Type': 'application/json' },
    })

    if (id !== ACCOUNT_ID_PLEX) return new Response(JSON.stringify({ status: 'Forbidden' }), {
      status: 403,
      headers: { 'Content-Type': 'application/json' },
    })

    const contentType = request.headers.get('Content-Type') || ''
    if (!contentType.includes('multipart/form-data')) return new Response(
      JSON.stringify({
        status: 'Bad request',
        message: 'Invalid Content-Type. Expected multipart/form-data.',
      }),
      { status: 400, headers: { 'Content-Type': 'application/json' } }
    )

    try {
      const data = await request.formData()
      const payload = JSON.parse(data.get('payload'))

		/*
			There's that event we talked about earlier.
		*/
      if (payload?.event === 'media.scrobble') {
        const artistName = payload['Metadata']['grandparentTitle']
        const albumName = payload['Metadata']['parentTitle']
        const trackName = payload['Metadata']['title']
        const listenedAt = Math.floor(DateTime.now().toSeconds())
        const artistKey = sanitizeMediaString(artistName)
        const albumKey = `${artistKey}-${sanitizeMediaString(albumName)}`

		/*
			Use the payload data to see if the artist for the listen exists. If it doesn't, email myself with the artist name and pertinent metadata that I'll need to correct the record or populate it.
		*/
        let { data: artistData, error: artistError } = await supabase
          .from('artists')
          .select('*')
          .ilike('name_string', artistName)
          .single()

        if (artistError && artistError.code === 'PGRST116') {
          const { error: insertArtistError } = await supabase
            .from('artists')
            .insert([
              {
                mbid: null,
                art: '4cef75db-831f-4f5d-9333-79eaa5bb55ee',
                name: artistName,
                tentative: true,
                total_plays: 0,
              },
            ])

          if (insertArtistError) {
            console.error('Error inserting artist: ', insertArtistError.message)
            return new Response(
              JSON.stringify({
                status: 'error',
                message: insertArtistError.message,
              }),
              { status: 500, headers: { 'Content-Type': 'application/json' } }
            )
          }

          await sendEmail(
            'New tentative artist record',
            `A new tentative artist record was inserted:\n\nArtist: ${artistName}\nKey: ${artistKey}`,
            authHeader
          )

          ;({ data: artistData, error: artistError } = await supabase
            .from('artists')
            .select('*')
            .ilike('name_string', artistName)
            .single())
        }

        if (artistError) {
          console.error('Error fetching artist:', artistError.message)
          return new Response(
            JSON.stringify({ status: 'error', message: artistError.message }),
            { status: 500, headers: { 'Content-Type': 'application/json' } }
          )
        }
        
		/* 
			The same thing we did for artists, but for albums. The value assigned to `art` is a placeholder image so that these temporary records don't yield broken images.
		*/
        let { data: albumData, error: albumError } = await supabase
          .from('albums')
          .select('*')
          .ilike('key', albumKey)
          .single()

        if (albumError && albumError.code === 'PGRST116') {
          const { error: insertAlbumError } = await supabase
            .from('albums')
            .insert([
              {
                mbid: null,
                art: '4cef75db-831f-4f5d-9333-79eaa5bb55ee',
                key: albumKey,
                name: albumName,
                tentative: true,
                total_plays: 0,
                artist: artistData.id,
              },
            ])

          if (insertAlbumError) {
            console.error('Error inserting album:', insertAlbumError.message)
            return new Response(
              JSON.stringify({
                status: 'error',
                message: insertAlbumError.message,
              }),
              { status: 500, headers: { 'Content-Type': 'application/json' } }
            )
          }

          await sendEmail(
            'New tentative album record',
            `A new tentative album record was inserted:\n\nAlbum: ${albumName}\nKey: ${albumKey}\nArtist: ${artistName}`,
            authHeader
          )

          ;({ data: albumData, error: albumError } = await supabase
            .from('albums')
            .select('*')
            .ilike('key', albumKey)
            .single())
        }

        if (albumError) {
          console.error('Error fetching album:', albumError.message)
          return new Response(
            JSON.stringify({ status: 'error', message: albumError.message }),
            { status: 500, headers: { 'Content-Type': 'application/json' } }
          )
        }
        
		/*
			Insert the listen. Finally.
		*/
        const { error: listenError } = await supabase.from('listens').insert([
          {
            artist_name: artistData['name_string'] || artistName,
            album_name: albumData['name'] || albumName,
            track_name: trackName,
            listened_at: listenedAt,
            album_key: albumKey,
          },
        ])

        if (listenError) {
          console.error('Error inserting listen:', listenError.message)
          return new Response(
            JSON.stringify({ status: 'error', message: listenError.message }),
            { status: 500, headers: { 'Content-Type': 'application/json' } }
          )
        }

        console.log('Listen record inserted successfully')
      }

      return new Response(JSON.stringify({ status: 'success' }), {
        headers: { 'Content-Type': 'application/json' },
      })
    } catch (e) {
      console.error('Error processing request:', e.message)
      return new Response(
        JSON.stringify({ status: 'error', message: e.message }),
        { status: 500, headers: { 'Content-Type': 'application/json' } }
      )
    }
  },
}
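
For a concrete sense of what the worker receives, here's a trimmed-down media.scrobble payload and the keys derived from it. The slug function is a simplified stand-in for sanitizeMediaString above (the real one leans on slugify and handles punctuation a bit differently); it's just enough to show the artist-name-album-name scheme.

```javascript
// A trimmed-down media.scrobble payload, for illustration
const payload = {
  event: 'media.scrobble',
  Metadata: {
    grandparentTitle: 'Hum',
    parentTitle: "You'd Prefer an Astronaut",
    title: 'Stars',
  },
}

// Simplified stand-in for sanitizeMediaString + slugify; the real
// sanitizer's output can differ slightly around punctuation.
const slug = (str) =>
  str
    .normalize('NFD')
    .replace(/[\u0300-\u036f]/g, '') // strip diacritics
    .replace(/[^a-zA-Z0-9]+/g, '-')  // collapse everything else to dashes
    .replace(/^-+|-+$/g, '')
    .toLowerCase()

const artistKey = slug(payload.Metadata.grandparentTitle)
const albumKey = `${artistKey}-${slug(payload.Metadata.parentTitle)}`
// artistKey → 'hum', albumKey → 'hum-you-d-prefer-an-astronaut'
```

That albumKey is what ties a listen row back to its album row.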

Displaying the data

I display a bunch (but not all) of this data. I stop displaying charts at 3 months. I could display — say — all time artist, album and track views, but I'd rather keep build times predictable (rather than letting them slowly grow forever). The root music page is available by clicking the headphones in my site's primary navigation. Or you can click here. Don't scroll up — just go.
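
The three-month cutoff itself is nothing fancy. The real windowing happens in the database views, but in JavaScript terms it's roughly this (the 91-day constant is an approximation for illustration):

```javascript
// Roughly what "stop at 3 months" means; listened_at is the unix
// timestamp (in seconds) that the worker inserts.
const THREE_MONTHS_IN_SECONDS = 60 * 60 * 24 * 91 // ~3 months
const windowListens = (listens, nowSeconds) =>
  listens.filter((listen) => listen.listened_at >= nowSeconds - THREE_MONTHS_IN_SECONDS)

// Example: one listen inside the window, one just outside it
const now = 1700000000
const sample = [
  { track_name: 'Stars', listened_at: now - 1000 },
  { track_name: 'Little Dipper', listened_at: now - THREE_MONTHS_IN_SECONDS - 1 },
]
const recent = windowListens(sample, now)
```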

We've got some text with data interpolated into it — artists, albums, tracks, genres and some calls to action. The last line with the emoji is the only dynamic part — the last played track or the one currently playing.

Below that, we've got grids of artists, albums, track charts and albums I'm looking forward to[2].

So, how is all of this fetched? Well, I use Directus to manage my site, so I use its API for that, right? No, no — it's all queried from Supabase directly using their SDK.

The frontend of my site is built using 11ty and all of the queries are performed in data files at build time. They used to query the tables directly (I'm a frontend developer — sorry); they now query optimized views. Let's look at artists.js (or don't):

import { createClient } from '@supabase/supabase-js'
import { sanitizeMediaString, parseCountryField } from '../../config/utilities/index.js'

const SUPABASE_URL = process.env.SUPABASE_URL
const SUPABASE_KEY = process.env.SUPABASE_KEY
const supabase = createClient(SUPABASE_URL, SUPABASE_KEY)
const PAGE_SIZE = 1000

/*
	Page through the artist results until we've queried all of the records. Right now, there are 613 (less than a page for our purposes). We get nearly all of the data associated with the artists and return it.
*/
const fetchAllArtists = async () => {
  let artists = []
  let rangeStart = 0

  while (true) {
    const { data, error } = await supabase
      .from('optimized_artists')
      .select(`
        id,
        mbid,
        name_string,
        tentative,
        total_plays,
        country,
        description,
        favorite,
        genre,
        emoji,
        tattoo,
        art,
        albums,
        concerts,
        books,
        movies,
        posts,
        related_artists,
        shows
      `)
      .range(rangeStart, rangeStart + PAGE_SIZE - 1)

    if (error) {
      console.error('Error fetching artists:', error)
      break
    }

    artists = artists.concat(data)
    if (data.length < PAGE_SIZE) break
    rangeStart += PAGE_SIZE
  }

  return artists
}

/*
	This used to be more concise, but I went down the road of connecting media types together. Artists were — as previously discussed — connected to albums. But I read books about music, musicians show up in movies, or TV shows, or posts — or they've got related artists (every black metal band in Iceland shares members with one another.)
*/
const processArtists = (artists) => {
  return artists.map(artist => ({
    id: artist['id'],
    mbid: artist['mbid'],
    name: artist['name_string'],
    tentative: artist['tentative'],
    totalPlays: artist['total_plays'],
    country: parseCountryField(artist['country']),
    description: artist['description'],
    favorite: artist['favorite'],
    genre: artist['genre'],
    emoji: artist['emoji'],
    tattoo: artist['tattoo'],
    image: artist['art'] ? `/${artist['art']}` : '',
    url: `/music/artists/${sanitizeMediaString(artist['name_string'])}-${sanitizeMediaString(parseCountryField(artist['country']))}`,
    albums: (artist['albums'] || []).map(album => ({
      id: album['id'],
      name: album['name'],
      releaseYear: album['release_year'],
      totalPlays: album['total_plays'],
      art: album.art ? `/${album['art']}` : ''
    })).sort((a, b) => a['release_year'] - b['release_year']),
    concerts: artist['concerts']?.[0]?.['id'] ? artist['concerts'].sort((a, b) => new Date(b['date']) - new Date(a['date'])) : null,
    books: artist['books']?.[0]?.['id'] ? artist['books'].map(book => ({
      title: book['title'],
      author: book['author'],
      isbn: book['isbn'],
      description: book['description'],
      url: `/books/${book['isbn']}`,
    })).sort((a, b) => a['title'].localeCompare(b['title'])) : null,
    movies: artist['movies']?.[0]?.['id'] ? artist['movies'].map(movie => ({
      title: movie['title'],
      year: movie['year'],
      tmdb_id: movie['tmdb_id'],
      url: `/watching/movies/${movie['tmdb_id']}`,
    })).sort((a, b) => b['year'] - a['year']) : null,
    shows: artist['shows']?.[0]?.['id'] ? artist['shows'].map(show => ({
      title: show['title'],
      year: show['year'],
      tmdb_id: show['tmdb_id'],
      url: `/watching/shows/${show['tmdb_id']}`,
    })).sort((a, b) => b['year'] - a['year']) : null,
    posts: artist['posts']?.[0]?.['id'] ? artist['posts'].map(post => ({
      id: post['id'],
      title: post['title'],
      date: post['date'],
      slug: post['slug'],
      url: post['slug'],
    })).sort((a, b) => new Date(b['date']) - new Date(a['date'])) : null,
    relatedArtists: artist['related_artists']?.[0]?.['id'] ? artist['related_artists'].map(relatedArtist => {
      relatedArtist['url'] = `/music/artists/${sanitizeMediaString(relatedArtist['name'])}-${sanitizeMediaString(parseCountryField(relatedArtist['country']))}`
      return relatedArtist
    }).sort((a, b) => a['name'].localeCompare(b['name'])) : null,
  }))
}

export default async function () {
  try {
    const artists = await fetchAllArtists()
    return processArtists(artists)
  } catch (error) {
    console.error('Error fetching and processing artists data:', error)
    return []
  }
}

My whole site has slowly been integrated with Directus and Supabase. If you'd like to see all of the data files, take a look at the source for my site's frontend. artists.js is relatively concise but — I hope — illustrative.

Plex to Cloudflare, Cloudflare to Supabase, Supabase to Directus, Supabase to 11ty. Round and round we go.

I started out looking to track music and I ended up moving all of the content for my site into a CMS. I've got music in here — why not books? Movies? TV? Posts? Pages? robots.txt? Uhhh...links? Yeah links.

Anyways. All of this is built hourly over at Cloudflare. The only network call is for the now playing web component. If you've got JavaScript disabled it'll show the last played track at the time the site was built. Why make it static? Well — I don't need to see this data live. I don't think any visitors do either. If I need to review something I'll pop into Directus.
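
The now playing component is simple in spirit. This is a hypothetical sketch, not my actual markup or endpoint: the formatting logic is the only interesting part, and returning null means the static, build-time text stays put.

```javascript
// Formatting logic for the now playing line; null means "leave the
// static, build-time text alone."
const formatNowPlaying = (track) =>
  track ? `${track.emoji || '🎧'} ${track.title} by ${track.artist}` : null

// In the browser this would back a custom element, roughly:
// class NowPlaying extends HTMLElement {
//   async connectedCallback() {
//     try {
//       const res = await fetch('/api/now-playing') // hypothetical endpoint
//       const text = formatNowPlaying(res.ok ? await res.json() : null)
//       if (text) this.textContent = text
//     } catch { /* keep the fallback */ }
//   }
// }
```

With JavaScript disabled, connectedCallback never runs, so visitors simply get the last played track from the most recent build.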

The rest of the data

Artist bios have been bootstrapped and painstakingly edited to include links, markdown formatting and media relationships. I like links (they're all over this post). Genre data is sourced from Wikipedia and each description links out to the appropriate page.

Artist images are from — all over — I guess.

Album images are dumped from file tags. I throw them in a shared album that I point Apple's photos screensaver at. Shifting tiles of album art.

When I add a new artist, it goes like this:

  1. Add their name.
  2. Associate the appropriate genre.
  3. Add an artist image.
  4. Enter the appropriate country code.
  5. Add a bio and format it.
  6. Add their MusicBrainz ID — this is an autocomplete field in Directus that queries the MusicBrainz API.
  7. Optionally: mark them as a favorite, add an emoji (or combination thereof) that show up when I'm listening to them, indicate whether I've got a tattoo inspired by them and associate any related media on the site.
  8. Tag the artist's music using Meta.
  9. Export the album art.
  10. Add the album art to my screensaver.
  11. Add the artist's music to Plex.
  12. Add the artist image and genres that match my site.
  13. Add albums for the artist.
    1. Add an album name.
    2. Add the album key.
    3. Add the release MusicBrainz ID (this field also queries their API).
    4. Add album art.
    5. Add the artist's name string that connects to the listens table.
    6. Add the release year.
    7. Repeat for the next album.
  14. Enjoy.

Conclusion

Is this worth doing? Should I do it? I dunno — does it sound worth it? I'm thrilled with it. It's not a novel application so much as it is a composition of parts. My site's primarily a blog but it also has, well, all of this built into it. It's very much a personal site. It's also a Sisyphean task.

Spend time on what you enjoy — I've spent a lot on this, I've learned a whole bunch and I'm quite happy with the result. I'm paying for some more infrastructure, but I'm also not paying for The Storygraph (which is a very nice service), Trakt, Last.fm, Letterboxd etc. etc. Tradeoffs.

If I've enjoyed something, it lands in a section in my site. If I share it, I share the link. It's on me to keep that link alive and preserve that. I like that quite a bit too.

I love music. I've built a site that reflects that. Thanks for reading and happy listening.


  1. I'm kidding — really. ↩

  2. These are associated with artists, tagged with a release date and link and rendered — I generate a calendar subscription and feeds for them too. ↩