
In this series, we will take a close look at the architecture of ActiveStorage for Rails.
In this first part, we will examine how ActiveStorage treats uploaded data and how to extend this process. The second part will explore how to augment the presentation of uploaded assets.
But first, let's quickly define what ActiveStorage does.
What Is ActiveStorage for Ruby on Rails?
Without recounting the entire ActiveStorage documentation, in a nutshell, ActiveStorage is an adapter to various forms of storing (mostly) user-generated files in your Ruby on Rails application in a straightforward way. The available storage backends can be divided into:
- Local disk storage
- Diverse flavors of cloud storage (most prominently Amazon S3 and compatible object storage providers)
Note: For the sake of completeness, there are also adapters to directly store binary data in your database, like active_storage-postgresql.
ActiveStorage allows you to transparently attach files to database records for easy access. You have probably already come across this API:
```ruby
class User < ApplicationRecord
  has_one_attached :avatar
end

class Post < ApplicationRecord
  has_many_attached :images
end
```
Here, the class methods `has_one_attached` and `has_many_attached` are responsible for wiring up your `User` or `Post` records with the respective attachments. But how does this magic work?
Under the hood, ActiveStorage uses two database tables generated when you install it. Here are their entries from the database schema definition:
create_table "active_storage_attachments", force: :cascade do |t| t.string "name", null: false t.string "record_type", null: false t.bigint "record_id", null: false t.bigint "blob_id", null: false t.datetime "created_at", null: false t.index ["blob_id"], name: "index_active_storage_attachments_on_blob_id" t.index ["record_type", "record_id", "name", "blob_id"], name: "index_active_storage_attachments_uniqueness", unique: true end create_table "active_storage_blobs", force: :cascade do |t| t.string "key", null: false t.string "filename", null: false t.string "content_type" t.text "metadata" t.string "service_name", null: false t.bigint "byte_size", null: false t.string "checksum" t.datetime "created_at", null: false t.index ["key"], name: "index_active_storage_blobs_on_key", unique: true end
The first table belongs to the join model `ActiveStorage::Attachment`. It references a polymorphic `record` (that's why the `has_(one|many)_attached` methods can be called from any ActiveRecord model) as well as a `Blob`. As you might correctly assume, this refers to the second table, which is mapped by the `ActiveStorage::Blob` model. This acts as a data container for all that is needed to upload or download a file to one of the configured services. Let's take it apart a bit:
- The `key` attribute refers to how the blob is called at its actual storage location (in most cases, an S3 bucket).
- `filename` is the original name of the file under which it was uploaded.
- `content_type` is its analyzed MIME type.
- `metadata` is a generic text column that can hold any metadata of the file, in JSON format. It's this column that our custom analyzers will make use of.
- `service_name` is used to identify the service in `config/storage.yml` to which this file was uploaded.
- `byte_size` and `checksum` are exactly what you'd expect these attributes to store.
How Does ActiveStorage Treat Uploaded Data?
With these general concepts out of the way, let's examine the ingest process used by ActiveStorage. In other words, once a file is uploaded, what happens next?
The answer lies in the `Attachment` model's code. In an `after_create_commit` callback, it enqueues a job to analyze the blob it pertains to later.
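Paraphrased and trimmed down, that hook looks roughly like this (a sketch based on the Rails source; the exact callback chain differs between Rails versions):

```ruby
# Simplified sketch of ActiveStorage::Attachment's analysis hook
# (paraphrased; not the verbatim Rails source)
class ActiveStorage::Attachment < ActiveStorage::Record
  after_create_commit :analyze_blob_later

  private
    def analyze_blob_later
      # enqueues an ActiveStorage::AnalyzeJob for this attachment's blob
      blob.analyze_later unless blob.analyzed?
    end
end
```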
The code that's used to perform the analysis reads simply enough:
```ruby
def analyze
  update! metadata: metadata.merge(extract_metadata_via_analyzer)
end

# ...

private
  def extract_metadata_via_analyzer
    analyzer.metadata.merge(analyzed: true)
  end

  def analyzer
    analyzer_class.new(self)
  end

  def analyzer_class
    ActiveStorage.analyzers.detect { |klass| klass.accept?(self) } || ActiveStorage::Analyzer::NullAnalyzer
  end
```
We observe that, as indicated before, the `analyze` method updates the `metadata` column of the blob. This metadata is extracted by an analyzer, but there's an important detail hidden in how exactly this analyzer is provided. The `analyzer_class` is retrieved from a list of analyzers that the `ActiveStorage` module itself keeps. Let's take a brief look at it in the Rails console:
```ruby
(dev)> ActiveStorage.analyzers
=> [ActiveStorage::Analyzer::ImageAnalyzer::Vips, ActiveStorage::Analyzer::ImageAnalyzer::ImageMagick, ActiveStorage::Analyzer::VideoAnalyzer, ActiveStorage::Analyzer::AudioAnalyzer]
```
Listed here are all the standard analyzers that ship with ActiveStorage: two for images (one for each of the two processing backends, Vips and ImageMagick), one for video, and one for audio data. From this list, the first analyzer that responds to `accept?(self)` with `true` is selected. So, for example, `ImageAnalyzer` checks whether a blob holds an image, and so on. Creating a new analyzer, then, only requires a class that extends `ActiveStorage::Analyzer` and is prepended to this list. We'll explore how to do this in the remainder of this article.
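To make that contract concrete, here's a minimal, hypothetical skeleton (the class name and extracted metadata are made up): `accept?` decides whether the analyzer handles a given blob, and `metadata` returns a hash that ends up merged into the blob's `metadata` column.

```ruby
# lib/active_storage/pdf_flag_analyzer.rb (a hypothetical skeleton)
module ActiveStorage
  class PdfFlagAnalyzer < ActiveStorage::Analyzer
    # Only handle PDF blobs; anything else falls through to the next analyzer in the list.
    def self.accept?(blob)
      blob.content_type == "application/pdf"
    end

    # Whatever we return here gets merged into the blob's metadata column.
    def metadata
      { pdf: true }
    end
  end
end
```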
Two Use Cases for Custom Analyzers
When would you reach for a custom analyzer? I contend that most use cases revolve around needs for enhanced presentation of assets. To back up that claim, I have prepared two prototypical examples:
- precomputing audio waveform data
- calculating image blurhashes
Extracting and Storing Audio Sample Data
Let's assume that we are building a directory of songs. We might use a `Song` model, which has an attached `recording`, for this:
```ruby
class Song < ApplicationRecord
  has_one_attached :recording
end
```
If we create a new record of this model and attach an audio file, what happens? In a full-stack scenario, we would use an upload form, but for the sake of investigating the analysis process, let's just create one from the Rails console:
```ruby
(dev)> song = Song.create(title: "Ruby Blues in D Flat")
(dev)> song.recording.attach(io: File.open("/path/to/file"), filename: "ruby_blues.wav")
```
Apart from the usual SQL log, Rails also informs us that it has enqueued a job to process the data:
```
Enqueued ActiveStorage::AnalyzeJob (Job ID: 2a923033-b0d2-4bfc-b368-d3eb1344e64b) to Async(default) with arguments: #<GlobalID:0x000000012292f428 @uri=#<URI::GID gid://active-storage-analyzers-previews/ActiveStorage::Blob/1>>
```
Let's now inspect the respective `Attachment` and `Blob` records:
```ruby
(dev)> song.recording_attachment
=> #<ActiveStorage::Attachment:0x00000001240c8eb8
 id: 1,
 name: "recording",
 record_type: "Song",
 record_id: 1,
 blob_id: 1,
 created_at: "2025-06-15 15:35:59.470664000 +0000">

(dev)> song.recording_blob
=> #<ActiveStorage::Blob:0x00000001235c6f60
 id: 1,
 key: "oi7xszss3y6kfer601zvxfpl1muz",
 filename: "ruby_blues.wav",
 content_type: "audio/x-wav",
 metadata: {"identified" => true},
 service_name: "local",
 byte_size: 35765766,
 checksum: "ddUPZtkz3hqVI9wUGOwp4g==",
 created_at: "2025-06-15 15:35:59.461367000 +0000">
```
Aha! The file has been correctly identified as being of type `audio/x-wav`, but apart from that, there's no other metadata being persisted. We're here to change that.
First, let's write our custom analyzer. We'll put it in the `lib/active_storage` directory and call it `ActiveStorage::WaveformAnalyzer`. To draw upon the existing implementation, we'll inherit from `ActiveStorage::Analyzer::AudioAnalyzer`.
We saw above that the `metadata` method is responsible for returning the appropriate data, so we'll override it. We have to be careful to call `super` and merge any new data into what the parent class already provides.
```ruby
# lib/active_storage/waveform_analyzer.rb
module ActiveStorage
  class WaveformAnalyzer < ActiveStorage::Analyzer::AudioAnalyzer
    def metadata
      super.merge waveform
    end

    def waveform
      rms_values = []

      download_blob_to_tempfile do |file|
        IO.popen([
          ffmpeg_path,
          "-i", file.path,
          "-ac", "1",
          "-f", "f32le",
          "-"
        ], "rb") do |io|
          frame_size = 4                # mono, 4 bytes (float32)
          chunk_size = 512 * frame_size # 512 frames

          while chunk = io.read(chunk_size)
            floats = chunk.unpack("e*") # little-endian float32
            next if floats.empty?

            rms = Math.sqrt(floats.sum { _1 ** 2 } / floats.size)
            rms_values << rms
          end
        end
      end

      # optionally store as Base64 packed string to save space
      # { waveform: [ rms_values.pack("e*") ].pack("m0") }
      { waveform: rms_values }
    end

    def ffmpeg_path
      ActiveStorage.paths[:ffmpeg] || "ffmpeg"
    end
  end
end
```
The real meat, though, is of course the computation of waveform datapoints. Because ActiveStorage depends on it, we can utilize `ffmpeg` for our purposes. The full call we pass to `IO.popen` here reads like this:
```sh
ffmpeg -i FILE_TO_ANALYZE -ac 1 -f f32le -
```
Here, `-ac 1` tells ffmpeg to mix the audio down to one channel, while `-f f32le` specifies "float 32-bit little endian" as the output format. The final dash (`-`) instructs ffmpeg to write the result to STDOUT, so we can actually pick it up in the block passed to `IO.popen`.
There's one important signal processing detail to mention: storing each and every sample as metadata wouldn't be very efficient; in fact, it would amount to storing the entire audio file as a JSON array. Instead, we perform some data compression by calculating the root mean square (RMS) over a frame of 512 samples (an arbitrary number, but usually a power of 2). That's only a fancy way of saying we're averaging each frame, except that we square every sample before averaging and take the square root afterwards, because audio sample data can be both positive and negative.
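To get a feeling for how much data this saves, here's a rough back-of-the-envelope calculation for a hypothetical three-minute recording sampled at 44.1 kHz (the numbers are illustrative only):

```ruby
# Hypothetical three-minute recording, mixed down to mono at 44.1 kHz
samples    = 180 * 44_100    # => 7_938_000 raw float32 samples
rms_values = samples / 512   # => 15_503 data points stored as metadata
```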
Our new analyzer isn't yet wired up to be used by the Rails application, so we do that in an initializer:
```ruby
# config/initializers/active_storage.rb
require_relative "../../lib/active_storage/waveform_analyzer.rb"

Rails.application.config.active_storage.analyzers.prepend ActiveStorage::WaveformAnalyzer
```
It's important to `prepend` it to the list here because, as we've observed above, the first analyzer in the list that accepts audio files will be used.
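If everything is hooked up correctly, a quick check in a freshly booted Rails console should now list our analyzer first (the output below is what we'd expect, not a verbatim transcript):

```ruby
(dev)> ActiveStorage.analyzers.first
=> ActiveStorage::WaveformAnalyzer
```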
Let's put it to use in the Rails console again:
```ruby
(dev)> song = Song.create(title: "Ruby Blues in D Flat")
(dev)> song.recording.attach(io: File.open("/path/to/file"), filename: "ruby_blues.wav")
(dev)> song.reload.recording_attachment.metadata[:waveform]
=> [0.00012394643736996606, 0.00087319937227967, 0.0037783625670793465, 0.005877352246693453, ... etc.]
```
Now that we have a compressed representation of the audio in our metadata, how can we make use of it? We'll take a look at an idiomatic ActiveStorage method in the second part of this series, but for starters, many JavaScript audio widgets allow you to specify precomputed waveform data, like WaveSurfer does in this example.
Calculating Image Blurhashes in Ruby
Blurhashes are compressed representations of images you can use instead of generic placeholders for an enhanced lazy loading experience. The blurhash Ruby gem provides a straightforward way to encode such strings from images. We add it to our application's dependencies like so:
```sh
$ bundle add blurhash
```
Before we begin writing our custom analyzer, it's important to note that the image processing backend comes in two flavors: Vips or ImageMagick. Since its API is a bit simpler, we'll concentrate on the latter and configure it in `config/application.rb`:
```ruby
config.active_storage.variant_processor = :mini_magick
```
We can now begin our implementation by subclassing the `ActiveStorage::Analyzer::ImageAnalyzer::ImageMagick` base analyzer:
```ruby
# lib/active_storage/blurhash_analyzer.rb
module ActiveStorage
  class BlurhashAnalyzer < ActiveStorage::Analyzer::ImageAnalyzer::ImageMagick
    attr_accessor :thumbnail

    def metadata
      read_image do |image|
        build_thumbnail(image)

        super.merge blurhash
      end
    end

    def blurhash
      {
        blurhash: ::Blurhash.encode(
          thumbnail.width,
          thumbnail.height,
          pixels
        )
      }
    end

    def build_thumbnail(image)
      # we scale down the image for faster blurhash processing
      @thumbnail ||= MiniMagick::Image.open(
        ::ImageProcessing::MiniMagick
          .source(image.path)
          .resize_to_limit(200, 200)
          .loader(page: 0)
          .call
          .path
      )
    end

    def pixels = @thumbnail.get_pixels.flatten

    protected

    def processor = "ImageMagick"
  end
end
```
What's going on here? We encounter an already familiar pattern: we populate the database column of the same name with more data via the `metadata` method. `read_image` is just a helper method provided by Rails that opens the image as a file, ready for us to use. The `build_thumbnail` method is actually optional, but very helpful for speeding up the blurhash calculation: we simply use `MiniMagick` to scale the image down to fit within the bounds of 200x200 pixels. The `blurhash` method then uses this thumbnail to build the compressed string representation of the downscaled image.
Like above, we prepend it to the list of analyzers in our initializer:
```ruby
# config/initializers/active_storage.rb
require_relative "../../lib/active_storage/waveform_analyzer.rb"
require_relative "../../lib/active_storage/blurhash_analyzer.rb"

Rails.application.config.active_storage.analyzers.prepend ActiveStorage::WaveformAnalyzer
Rails.application.config.active_storage.analyzers.prepend ActiveStorage::BlurhashAnalyzer
```
Now it's time to test it out. For this, we'll use a test image from picsum.photos:

Let's open a Rails console and attach this image to a post:
```ruby
(dev)> post = Post.create(title: "Active Storage Analyzers")
(dev)> post.images.attach(io: URI.open("https://picsum.photos/id/128/1200/800"), filename: "picsum_128.jpg")
```

If we inspect its metadata, we will now find a blurhash representation of the image:

```ruby
(dev)> post.reload.images.first.metadata["blurhash"]
=> "LWDJS1o#D%kD~qbIIUof%2WARkfP"
```
For reference, converted to an actual preview image, it looks like this:

To make use of this in our application frontend, the blurhash needs to be decoded and presented. Typically, this involves using the official TypeScript library and a bespoke Stimulus controller. In the second part of this series, we'll take a look at implementing a pure ActiveStorage-generated preview.
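On the Rails side of that handoff, all we really need to do is expose the stored hash to the markup. Here's a minimal sketch as a view helper, assuming a hypothetical `blurhash` Stimulus controller that performs the client-side decoding:

```ruby
# app/helpers/images_helper.rb (hypothetical helper; the Stimulus controller is assumed)
module ImagesHelper
  def blurhash_image_tag(image, **options)
    image_tag image,
      loading: "lazy",
      data: {
        controller: "blurhash",                      # hypothetical Stimulus controller
        blurhash_value: image.metadata["blurhash"]   # decoded into a placeholder client-side
      },
      **options
  end
end
```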
That's it for this first part!
Wrap Up
In this article, we've taken a few first steps to customize how ActiveStorage handles and processes media data. We've learned that the `ActiveStorage::Blob` model is where the data describing an attachment is stored. When writing bespoke analyzers, it's necessary to put any results in its `metadata` column.
We then looked at two examples explaining when and how to write your own custom ActiveStorage analyzers: extracting waveform data from audio files and calculating image blurhashes. Both implementations demonstrate the diligence that has been put into ActiveStorage's background media processing API: the glue code necessary to plug binaries like ffmpeg or ImageMagick into the analysis pipeline is minimal.
In the second and final part of this series, we will reverse this process and examine ways to implement custom ActiveStorage previewers, providing compact graphic representations of the calculated metadata.
Happy coding!
Julian Rubisch
Our guest author Julian is a freelance Ruby on Rails consultant based in Vienna, specializing in Reactive Rails. Part of the StimulusReflex core team, he has been at the forefront of developing cutting-edge HTML-over-the-wire technology since 2020.