Make Your Photo Backup Searchable
Add IPTC keywords, captions, and XMP sidecars to a folder of old photos in one batch
Try PhotoScanr Free. Free to use • No sign-up required • Instant results
By Duncan Rawlinson · Updated
Most photographers think about backups in terms of how many copies of the file exist and how far apart those copies are stored. The 3-2-1 rule. An external drive, a NAS, a cloud bucket. That covers the bits. It does not cover the question that matters years later, which is whether you can find anything.
A photo backup without metadata is a digital shoebox. The pixels survive. The meaning does not. You can scroll through ten thousand thumbnails, but you cannot ask a question and get an answer. Where was that beach we went to in 2018? Which folder has the family reunion shots? What was the name of the bird my dad photographed in his backyard? Without metadata, none of this is searchable.
This guide explains the kinds of metadata that travel with your photos, what survives different operations, and how AI keywording lets you retroactively make a decade-old archive searchable in a single afternoon.
Photo metadata is not one thing. There are three overlapping standards, and each one was designed for a different purpose.
EXIF stands for Exchangeable Image File Format. It is the metadata your camera writes when you press the shutter. Date and time. Camera make and model. Lens. ISO, aperture, shutter speed. GPS coordinates if your camera has GPS or you geotag later. EXIF is automatic, factual, and almost always present.
For backups, EXIF is the floor. Even if you never add a single keyword, EXIF tells you when a photo was taken and what equipment shot it. That alone is a lot of searchability.
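As a minimal sketch of what "searchable by date" means in practice, the Python below parses EXIF capture times and groups files by year. EXIF stores the capture time as text in the form YYYY:MM:DD HH:MM:SS. The file names and timestamps are made-up examples, and the sketch assumes the date strings have already been read out of the files by some other tool.

```python
from collections import defaultdict
from datetime import datetime

# EXIF writes capture time as "YYYY:MM:DD HH:MM:SS" (colons in the date part).
EXIF_DATE_FORMAT = "%Y:%m:%d %H:%M:%S"


def parse_exif_datetime(value: str) -> datetime:
    """Turn an EXIF date string into a datetime object."""
    return datetime.strptime(value.strip(), EXIF_DATE_FORMAT)


def group_by_year(capture_times: dict[str, str]) -> dict[int, list[str]]:
    """Map each capture year to the sorted file names shot in that year."""
    by_year = defaultdict(list)
    for filename, raw_value in capture_times.items():
        by_year[parse_exif_datetime(raw_value).year].append(filename)
    return {year: sorted(names) for year, names in sorted(by_year.items())}


# Hypothetical values, as they might appear in a DateTimeOriginal field.
sample = {
    "IMG_0012.JPG": "2014:07:19 16:42:05",
    "IMG_4382.CR3": "2018:10:06 11:03:27",
    "IMG_4383.CR3": "2018:10:06 11:04:10",
}
print(group_by_year(sample))
```

Date is the only axis this gives you. Everything else in this guide is about adding the other axes.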
IPTC stands for International Press Telecommunications Council, the body that publishes the standard. It is the metadata standard built for newsrooms and stock libraries. Caption. Headline. Keywords. Creator. Copyright. Location described in human-readable terms. Subject codes.
IPTC is the layer that turns a photo from a file into a record. A caption that says "Family hiking the Bruce Trail in October 2018" is what you actually need to find a photo years later. IPTC is also the standard that stock platforms, news agencies, and image search engines all read.
XMP stands for Extensible Metadata Platform. It is Adobe's container format that wraps EXIF, IPTC, and any other metadata into a single XML structure. XMP is what Lightroom, Bridge, and most other modern photo tools actually read and write.
For RAW files, XMP usually lives in a sidecar file alongside the photo. For JPEGs and TIFFs, XMP is embedded inside the file itself. Either way, XMP is the practical container that holds your descriptive metadata.
The question that exposes whether your metadata strategy is working is not "can I find a photo from last week." It is "can I find a photo from 2014." Your memory of last week is fresh. Your memory of 2014 is gone.
For old photos to be findable, the metadata has to answer four questions. When was this taken? Where was it taken? Who or what is in it? What was happening? EXIF answers when. GPS in EXIF answers where, if location was recorded. The remaining two questions are what IPTC keywords and captions exist for. Without them, your archive is searchable by date and not much else.
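The four questions can be turned into a rough coverage check. The sketch below takes one photo's metadata as a plain dictionary and reports which questions it can answer. The tag names follow ExifTool's naming, but the grouping of tags under each question is an assumption made for illustration, not part of any standard.

```python
# Which tags can answer which question. Tag names follow ExifTool's naming;
# the grouping itself is an illustrative assumption.
QUESTION_TAGS = {
    "when": ["DateTimeOriginal", "CreateDate"],
    "where": ["GPSLatitude", "City", "Location"],
    "who_or_what": ["Keywords", "Subject"],
    "what_was_happening": ["Caption-Abstract", "Description", "Headline"],
}


def answered_questions(record: dict) -> dict[str, bool]:
    """Report, per question, whether any relevant tag has a non-empty value."""

    def has_value(tag: str) -> bool:
        return record.get(tag) not in (None, "", [])

    return {
        question: any(has_value(tag) for tag in tags)
        for question, tags in QUESTION_TAGS.items()
    }


# Hypothetical record for a photo that only has what the camera wrote.
camera_only = {"DateTimeOriginal": "2014:07:19 16:42:05"}
print(answered_questions(camera_only))
```

A photo straight off the card typically answers only the first question, which is the gap the rest of this guide is about closing.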
If you are looking at a backup with thirty thousand photos and almost no descriptive metadata, do not panic. AI keywording can fill in the gaps retroactively. The next sections cover how.
One of the underappreciated facts about metadata is that it does not always survive when you convert files. Knowing what carries through and what gets stripped is essential for long-term archive planning.
Adobe DNG carries through EXIF, IPTC, and XMP cleanly. DNG is one of the better archival formats specifically because it embeds metadata reliably and is a documented standard.
When you export to JPEG, EXIF and IPTC are written into the file by default, and XMP is embedded as well. The export dialog has options to include or strip metadata, so make sure you are not stripping it for archival exports.
Most social platforms strip EXIF and IPTC on upload, both for privacy and for file size. The version of your photo that lives on Instagram or Facebook has none of your metadata. This is one of many reasons social platforms are not backups.
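One way to see whether a JPEG still carries its metadata after an export or a round trip through a platform is to look for the segments that hold it. The sketch below walks the JPEG header segments and reports which metadata blocks are present. It is a simplified reader that assumes a well-formed file and only checks for the presence of each block, not its contents. The function name is hypothetical.

```python
import struct

EXIF_HEADER = b"Exif\x00\x00"
XMP_HEADER = b"http://ns.adobe.com/xap/1.0/\x00"
PHOTOSHOP_HEADER = b"Photoshop 3.0\x00"
IPTC_RESOURCE = b"8BIM\x04\x04"  # Photoshop resource 0x0404 holds the IPTC record


def metadata_blocks_in_jpeg(data: bytes) -> set[str]:
    """Return the names of the metadata blocks found before the image data."""
    if data[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    found = set()
    pos = 2
    while pos + 4 <= len(data) and data[pos] == 0xFF:
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):  # start of scan or end of image
            break
        (length,) = struct.unpack(">H", data[pos + 2 : pos + 4])
        if length < 2:  # malformed segment, stop rather than loop forever
            break
        payload = data[pos + 4 : pos + 2 + length]
        if marker == 0xE1 and payload.startswith(EXIF_HEADER):
            found.add("EXIF")
        elif marker == 0xE1 and payload.startswith(XMP_HEADER):
            found.add("XMP")
        elif marker == 0xED and payload.startswith(PHOTOSHOP_HEADER):
            if IPTC_RESOURCE in payload:
                found.add("IPTC")
        pos += 2 + length  # marker bytes plus the segment length
    return found
```

Reading a file with `Path("photo.jpg").read_bytes()` and passing the result in is enough to compare your original against the copy that came back from a platform. An empty set on the downloaded copy means the metadata was stripped.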
With a RAW file and an XMP sidecar, the sidecar holds all of your descriptive metadata and the RAW file holds the pixels. As long as both files travel together with the same base name, the metadata is preserved. If the sidecar is left behind, the metadata is lost.
For backup purposes, the practical rule is to back up the RAW and the XMP sidecar together, or to convert to DNG, which embeds the metadata in the same file as the image data. Never assume a JPEG export is a backup.
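The "back them up together" rule is easy to verify mechanically. The sketch below takes a list of file paths from a backup and returns the RAW files that have no sidecar with the same base name in the same folder. The list of RAW extensions is an assumption and is not exhaustive. Add whatever your cameras produce.

```python
from pathlib import PurePath

# A few common RAW extensions. This list is illustrative, not complete.
RAW_EXTENSIONS = {".cr2", ".cr3", ".nef", ".arw", ".raf", ".orf", ".rw2"}


def raws_missing_sidecars(paths: list[str]) -> list[str]:
    """Return RAW paths that have no .xmp file with the same folder and base name."""
    parsed = [(original, PurePath(original)) for original in paths]
    # Every (folder, base name) pair that has a sidecar.
    sidecars = {
        (path.parent, path.stem)
        for _, path in parsed
        if path.suffix.lower() == ".xmp"
    }
    return sorted(
        original
        for original, path in parsed
        if path.suffix.lower() in RAW_EXTENSIONS
        and (path.parent, path.stem) not in sidecars
    )


# Hypothetical listing of a backup folder.
listing = [
    "2018/IMG_4382.CR3",
    "2018/IMG_4382.xmp",
    "2018/IMG_4383.CR3",
    "2019/IMG_4382.CR3",
]
print(raws_missing_sidecars(listing))
```

To run it against a real drive, build the listing with `[str(p) for p in Path(root).rglob("*") if p.is_file()]`. Matching on folder as well as base name matters because camera file names repeat across years.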
The most useful application of AI in photography in 2026 is not generating new images. It is making existing archives searchable. A folder of fifteen thousand vacation, family, and event photos with no keywords can be processed in batches and given full descriptive metadata in a few sessions.
Do not try to keyword your entire archive at once. Start with the photos you want to be able to find. Family events. Significant trips. Work projects. Each of these is a folder you can run through PhotoScanr in a focused session.
For each batch, set style preferences with the year, location, and event type. "These are family photos from the 2017 cottage trip in Muskoka, Ontario." The model uses that context to ground its captions and keywords in the actual situation rather than guessing from visual cues alone.
Lightroom platform mode produces output structured for the IPTC fields that survive in your files. Captions, headlines, keywords, and copyright all land in the right places. ZIP export gives you XMP sidecars you can drop alongside the originals.
The Free tier handles five photos per day, which is fine for trying things out but not for archive work. Pro processes one hundred per day with a batch size of twenty-five. Studio processes six hundred per day with a batch size of one hundred. Studio is the name an older tier was given in version 1.22.0, when its daily quota and batch size were both doubled, and it is the right tier for clearing through an archive.
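The quotas translate directly into calendar time. The arithmetic below is a rough planning sketch using only the daily figures stated above. It assumes the full quota is used every day and that each photo is processed once.

```python
import math

# Daily quotas as stated in the text above.
DAILY_QUOTA = {"Free": 5, "Pro": 100, "Studio": 600}


def days_needed(photo_count: int, per_day: int) -> int:
    """Whole days required to process photo_count photos at per_day per day."""
    return math.ceil(photo_count / per_day)


def estimate(photo_count: int) -> dict[str, int]:
    """Days needed on each tier for a folder of photo_count photos."""
    return {
        tier: days_needed(photo_count, quota)
        for tier, quota in DAILY_QUOTA.items()
    }


# The fifteen-thousand-photo folder used as an example earlier in this guide.
for tier, days in estimate(15_000).items():
    print(f"{tier}: {days} days")
```

Numbers like these are the practical argument for starting with the folders you most want to find rather than working through the archive in file order.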
For more on the keyword side specifically, see the Lightroom keyword strategy guide.
An XMP sidecar is a plain text XML file that lives in the same folder as your RAW file with the same base name and a .xmp extension. IMG_4382.CR3 has IMG_4382.xmp next to it. The sidecar holds your edits, keywords, and captions.
In Lightroom, turn on automatic metadata writing in catalog settings so that any keyword or caption change you make is also written to the sidecar in real time. This is the single most important catalog setting for long-term archive integrity.
Software changes. Catalog formats become legacy. Companies get acquired. The only metadata that survives a thirty-year horizon is metadata embedded in the file itself or in a documented sidecar standard.
EXIF, IPTC, and XMP are all open standards with extensive documentation. They are readable by ExifTool, which is open source and runs on Windows, macOS, and Linux. If your metadata is in these formats, you have done the right thing for the long term.
Star ratings and color labels you set inside Lightroom are written to XMP if you have automatic writing enabled. Collections and smart collections are catalog only. If a piece of organizational information matters for the long term, find a way to express it as a keyword or IPTC field rather than a Lightroom-only construct.
Once a year, pick a random photo from each backup and open it in a tool other than Lightroom. If the keywords and captions are visible in Bridge, Photo Mechanic, or ExifTool, you are in good shape. If they only show up in Lightroom, something is not being written to file, and you need to fix the catalog or export settings now while you remember.
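The yearly spot check can be partly scripted around ExifTool. The sketch below picks a random sample, builds an ExifTool command that asks for the keyword and caption tags as JSON, and reduces the JSON to the fields that actually hold a value. It assumes ExifTool is installed and on your PATH. The helper names are hypothetical, and the sample JSON is a made-up example of the shape ExifTool's -json output takes.

```python
import json
import random

# Tags that hold keywords and captions in IPTC and XMP, by ExifTool name.
EXIFTOOL_TAGS = ["-Keywords", "-Subject", "-Caption-Abstract", "-Description"]


def pick_audit_sample(files: list[str], count: int = 1, seed=None) -> list[str]:
    """Choose up to count files at random. Pass a seed for a repeatable pick."""
    rng = random.Random(seed)
    return rng.sample(sorted(files), min(count, len(files)))


def exiftool_command(path: str) -> list[str]:
    """Argument list for reading the descriptive tags of one file as JSON."""
    return ["exiftool", "-json", *EXIFTOOL_TAGS, str(path)]


def descriptive_fields(exiftool_json: str) -> dict:
    """Keep only the descriptive tags that have a value in ExifTool's JSON."""
    record = json.loads(exiftool_json)[0]
    return {
        tag: value
        for tag, value in record.items()
        if tag != "SourceFile" and value not in ("", [], None)
    }


# Made-up example of ExifTool's JSON output for one file.
sample_output = """[{
  "SourceFile": "IMG_4382.jpg",
  "Keywords": ["hiking", "Bruce Trail"],
  "Caption-Abstract": "Family hiking the Bruce Trail in October 2018"
}]"""
print(descriptive_fields(sample_output))
```

To run the real check, pass the command to `subprocess.run(exiftool_command(path), capture_output=True, text=True, check=True)` and feed its `stdout` to `descriptive_fields`. An empty result for a photo you know you keyworded is the warning sign described above. For RAW files with sidecars, point the command at the .xmp file instead.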
Lightroom platform mode in PhotoScanr writes output structured for IPTC keywords, captions, and headlines, with hierarchical keyword support. This is what you want for archive work.
The ZIP export gives you a folder of XMP sidecars matched to your filenames. Drop them next to your originals and the metadata is there for any future tool to read.
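If your originals are spread across many folders, dropping the sidecars into place by hand gets tedious. The sketch below plans where each sidecar should go by matching base names, without touching any files. It assumes the ZIP has been extracted to a single folder of .xmp files, which is an assumption about the export layout, and the function name is hypothetical. Base names that appear in more than one folder are reported rather than guessed at.

```python
from collections import defaultdict
from pathlib import PurePath


def plan_sidecar_placement(sidecars: list[str], originals: list[str]):
    """Return (plan, unmatched, ambiguous) for a set of exported sidecars.

    plan maps each sidecar path to the destination path next to its original.
    """
    # Index every folder in which a given base name occurs among the originals.
    folders_by_stem = defaultdict(set)
    for original in originals:
        path = PurePath(original)
        if path.suffix.lower() != ".xmp":
            folders_by_stem[path.stem].add(path.parent)

    plan, unmatched, ambiguous = {}, [], []
    for sidecar in sidecars:
        path = PurePath(sidecar)
        folders = folders_by_stem.get(path.stem, set())
        if len(folders) == 1:
            plan[sidecar] = (next(iter(folders)) / path.name).as_posix()
        elif not folders:
            unmatched.append(sidecar)
        else:
            ambiguous.append(sidecar)  # same base name in several folders
    return plan, unmatched, ambiguous


# Hypothetical archive and extracted export.
originals = [
    "archive/2017/IMG_4382.CR3",
    "archive/2017/IMG_0001.CR3",
    "archive/2018/IMG_0001.CR3",
]
sidecars = ["export/IMG_4382.xmp", "export/IMG_0001.xmp", "export/IMG_9999.xmp"]
print(plan_sidecar_placement(sidecars, originals))
```

Review the plan before applying it, for example with `shutil.copy2(source, destination)` for each entry, and check first that the destination does not already hold a sidecar with edits you want to keep.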
For old archive work, use style preferences to give the model the year, place, and event context the photos belong to. This produces keywords and captions that are accurate to the situation rather than generic.
Turn grounding on if your archive has photos with specific landmarks, buildings, or places that need identification. The model can cross check against external sources rather than guessing.
A backup with metadata is a backup that stays useful as years pass. EXIF for the technical record. IPTC for the descriptive record. XMP as the container that holds it all in a portable, open form. AI to fill in the descriptive layer for everything you never had time to keyword by hand.
The work of adding metadata to an old archive feels daunting until you do the first batch and realize how fast it goes when you are not typing every keyword yourself. After a few weekends, the archive that was unsearchable for a decade becomes the most useful library you own.
Compare plans on the pricing page or open the homepage and start with a small folder to see how it works.