Resource URIs

Use airbnb as a database/sql-style driver so a host program can address Airbnb as airbnb:// URIs.

airbnb is a command-line tool, but the airbnb Go package is also a small driver that makes Airbnb addressable through resource URIs. A host program registers it the way a program registers a database driver with database/sql, then dereferences airbnb:// URIs without knowing anything about how Airbnb is fetched.

The host that does this today is ant, a single binary that puts one URI namespace over a family of site tools. The examples below use ant; any program that links the package gets the same behavior.

Mounting the driver

A host enables the driver with one blank import, exactly like import _ "github.com/lib/pq":

import _ "github.com/tamnd/airbnb-cli/airbnb"

The package's init registers a domain with the scheme airbnb for the hosts www.airbnb.com and airbnb.com. The standalone airbnb binary does not change.
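
For readers new to the pattern, the sketch below shows how a database/sql-style registration usually works in Go: a package-level registry, a Register function, and an init that calls it. Every name here (Domain, Register, Lookup) is hypothetical and collapsed into one file for illustration; it is not the actual airbnb or ant API.

```go
package main

import (
	"fmt"
	"sync"
)

// Domain is a hypothetical description of what a driver registers:
// a URI scheme plus the web hosts it claims.
type Domain struct {
	Scheme string
	Hosts  []string
}

var (
	mu       sync.RWMutex
	registry = map[string]Domain{}
)

// Register adds a domain under its scheme. Like sql.Register, it panics
// on a duplicate, since that is a programming error caught at init time.
func Register(d Domain) {
	mu.Lock()
	defer mu.Unlock()
	if _, dup := registry[d.Scheme]; dup {
		panic("Register called twice for scheme " + d.Scheme)
	}
	registry[d.Scheme] = d
}

// Lookup returns the domain registered for a scheme, if any.
func Lookup(scheme string) (Domain, bool) {
	mu.RLock()
	defer mu.RUnlock()
	d, ok := registry[scheme]
	return d, ok
}

// In a real driver this init would live in the driver package, so that a
// blank import of that package is enough to run it.
func init() {
	Register(Domain{Scheme: "airbnb", Hosts: []string{"www.airbnb.com", "airbnb.com"}})
}

func main() {
	d, ok := Lookup("airbnb")
	fmt.Println(ok, d.Hosts)
}
```

The point of the design is that the host never names the driver's types: it only asks the registry for a scheme, which is why a blank import is enough.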

Addressing records

A URI is scheme://authority/id. The resolver types are:

URI                  What it is
airbnb://room/<id>   one listing, keyed by its room id
airbnb://host/<id>   a host's public profile

ant get airbnb://room/12345        # the listing record
ant cat airbnb://room/12345        # just the description body
ant get airbnb://host/555          # the host profile
ant url airbnb://room/12345        # the live https URL
ant resolve https://www.airbnb.com/rooms/12345  # a pasted link, back to its URI

room and host are best-effort: from a datacenter they may hit Airbnb's edge wall and report need-auth, the same as the matching commands. There is no API to fall back to. See what anonymous access reaches.
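
The scheme://authority/id shape parses cleanly with Go's standard net/url package, because the authority lands in the URL's host position and the id in its path. The sketch below is an illustration only: Ref, ParseRef, and FromRoomLink are hypothetical names, not the package's API, and the link mapping handles only the /rooms/<id> form shown above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Ref is a hypothetical parsed form of an airbnb:// URI.
type Ref struct {
	Authority string // for example "room" or "host"
	ID        string // for example "12345"
}

// ParseRef splits airbnb://authority/id using the standard URL parser.
func ParseRef(raw string) (Ref, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return Ref{}, err
	}
	id := strings.TrimPrefix(u.Path, "/")
	if u.Scheme != "airbnb" || u.Host == "" || id == "" {
		return Ref{}, fmt.Errorf("not an airbnb://authority/id URI: %q", raw)
	}
	return Ref{Authority: u.Host, ID: id}, nil
}

// FromRoomLink maps a pasted room link back to its URI. Only the
// /rooms/<id> path is handled; other link shapes are out of scope here.
func FromRoomLink(link string) (string, error) {
	u, err := url.Parse(link)
	if err != nil {
		return "", err
	}
	host := u.Hostname()
	if host != "www.airbnb.com" && host != "airbnb.com" {
		return "", fmt.Errorf("not an Airbnb link: %q", link)
	}
	if !strings.HasPrefix(u.Path, "/rooms/") {
		return "", fmt.Errorf("not a room link: %q", link)
	}
	id := strings.Trim(strings.TrimPrefix(u.Path, "/rooms/"), "/")
	if id == "" {
		return "", fmt.Errorf("room link has no id: %q", link)
	}
	return "airbnb://room/" + id, nil
}

func main() {
	ref, err := ParseRef("airbnb://room/12345")
	fmt.Println(ref.Authority, ref.ID, err)

	uri, err := FromRoomLink("https://www.airbnb.com/rooms/12345")
	fmt.Println(uri, err)
}
```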

Collections

ls lists the members of a collection. Each list operation has its own authority, so they never shadow one another:

URI                            What it lists
airbnb://search/<place>        stays in a place
airbnb://reviews/<id>          a listing's reviews
airbnb://calendar/<id>         a listing's availability days
airbnb://experiences/<place>   experiences in a place
airbnb://listings/<id>         a host's public listings

ant ls airbnb://search/Lake%20Tahoe    # the stays in a place
ant ls airbnb://reviews/12345          # the listing's reviews
ant ls airbnb://listings/555           # the host's listings
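
Place names such as Lake Tahoe have to be percent-encoded before they go into the path, as the Lake%20Tahoe example shows. A minimal sketch of building and reading such a URI with the standard library follows; SearchURI and PlaceOf are hypothetical helper names used only for illustration.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// SearchURI builds airbnb://search/<place>, percent-encoding the place
// so that spaces and slashes survive as a single path segment.
func SearchURI(place string) string {
	return "airbnb://search/" + url.PathEscape(place)
}

// PlaceOf recovers the human-readable place from a collection URI.
func PlaceOf(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	// url.Parse has already percent-decoded u.Path.
	return strings.TrimPrefix(u.Path, "/"), nil
}

func main() {
	uri := SearchURI("Lake Tahoe")
	fmt.Println(uri)

	place, err := PlaceOf(uri)
	fmt.Println(place, err)
}
```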

Walking the graph

Every record carries explicit edges to the records it points at, so a host can breadth-first crawl the site and write it to disk without scraping URLs out of free text. A resolver edge names a bare field and points at one record; a collection edge carries the parent id under a <name>_ref field and points at a list authority. The edges are:

From      Field             Edge to
Place     search_ref        airbnb://search/<name>
Place     experiences_ref   airbnb://experiences/<name>
Listing   room              airbnb://room/<id>
Listing   host              airbnb://host/<id>
Room      host_id           airbnb://host/<id>
Room      reviews_ref       airbnb://reviews/<id>
Room      calendar_ref      airbnb://calendar/<id>
Review    room              airbnb://room/<id>
Review    author_id         airbnb://host/<id>
Day       room              airbnb://room/<id>
Host      listings_ref      airbnb://listings/<id>

The edges close into one connected graph. A place suggestion fans out into both a stay search and an experience search for the place; a search card walks straight through to its full room and its host; a room reaches its host, its reviews, and its calendar; a review reaches its listing and the reviewer's own profile; a calendar day reaches its room; a host reaches the host's other listings. No node is left without an outward edge, so a crawl never dead-ends at a record: it keeps going until it has covered everything reachable from its starting point or hit its depth limit. Starting from any node, --follow walks these edges:

ant export airbnb://search/Lake%20Tahoe --follow 2 --to ./data  # each listing's room, then its host, reviews, and calendar
ant get airbnb://room/12345
ant cat airbnb://room/12345        # the description body
ant url airbnb://room/12345

Each record is written under its minted URI with its edges intact, so the saved set reconstructs the slice of the site that was reached: the search results, the full listing behind each card, its reviews and calendar, the hosts behind those listings, and the profile of each reviewer.
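
The walk described here is an ordinary depth-limited breadth-first search over edge URIs. The sketch below shows one way it could look, under stated assumptions: Record, Fetch, and Crawl are hypothetical names, the site is replaced by a small in-memory map, and the depth semantics (the start node is depth 0) are a guess at what --follow means rather than ant's confirmed behavior.

```go
package main

import "fmt"

// Record is a hypothetical stand-in for a fetched record: its own URI
// plus the URIs its edge fields point at.
type Record struct {
	URI   string
	Edges []string
}

// Fetch dereferences one URI; the bool is false when nothing is reachable.
type Fetch func(uri string) (Record, bool)

// Crawl visits start and everything within depth edges of it, breadth
// first, returning URIs in visit order. Each URI is fetched at most once.
func Crawl(start string, depth int, fetch Fetch) []string {
	type item struct {
		uri string
		d   int
	}
	seen := map[string]bool{start: true}
	queue := []item{{start, 0}}
	var order []string
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		rec, ok := fetch(cur.uri)
		if !ok {
			continue // unreachable record (for example need-auth); skip it
		}
		order = append(order, rec.URI)
		if cur.d == depth {
			continue // at the follow limit: keep the record, not its edges
		}
		for _, next := range rec.Edges {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, item{next, cur.d + 1})
			}
		}
	}
	return order
}

func main() {
	// A tiny in-memory graph standing in for the live site.
	site := map[string][]string{
		"airbnb://search/Lake%20Tahoe": {"airbnb://room/12345"},
		"airbnb://room/12345":          {"airbnb://host/555", "airbnb://reviews/12345", "airbnb://calendar/12345"},
		"airbnb://host/555":            {"airbnb://listings/555"},
		"airbnb://reviews/12345":       {},
		"airbnb://calendar/12345":      {},
		"airbnb://listings/555":        {"airbnb://room/12345"},
	}
	fetch := func(uri string) (Record, bool) {
		edges, ok := site[uri]
		return Record{URI: uri, Edges: edges}, ok
	}
	for _, uri := range Crawl("airbnb://search/Lake%20Tahoe", 2, fetch) {
		fmt.Println(uri)
	}
}
```

The seen set is what keeps the cycle between a host's listings and their rooms from looping forever; with a depth of 2 the walk stops at the host, reviews, and calendar without expanding them further.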

These edge fields stay out of the table and CSV views (they would be noise in a terminal) but are always present in the JSON and JSONL a host reads.
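
As a rough illustration of that split, the sketch below drops edge fields when choosing table columns but keeps them when encoding JSON. It is an assumption-laden example: the name-based filter, the TableColumns helper, and the non-edge fields of the sample record are all hypothetical, and only the edge field names come from the table above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// edgeFields holds the edge field names listed in "Walking the graph".
var edgeFields = map[string]bool{
	"search_ref": true, "experiences_ref": true, "room": true, "host": true,
	"host_id": true, "reviews_ref": true, "calendar_ref": true,
	"author_id": true, "listings_ref": true,
}

// TableColumns returns the sorted field names a table view would show:
// every field of the record except the edge fields.
func TableColumns(rec map[string]any) []string {
	var cols []string
	for k := range rec {
		if !edgeFields[k] {
			cols = append(cols, k)
		}
	}
	sort.Strings(cols)
	return cols
}

func main() {
	// Hypothetical room record; "id" and "name" are made-up field names.
	rec := map[string]any{
		"id": "12345", "name": "Lakeside cabin",
		"host_id": "555", "reviews_ref": "12345", "calendar_ref": "12345",
	}
	fmt.Println(TableColumns(rec)) // table view: edge fields dropped

	out, err := json.Marshal(rec)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // JSON view: every field kept
}
```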

Why this is the same code

The driver and the binary share one definition per operation. A resolver op answers both airbnb room on the command line and ant get airbnb://room/... through a host, from the same handler and the same client. There is no second implementation to keep in step.
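
One way to picture "one definition per operation" is a single table of ops that both entry points dispatch into. The sketch below is not the package's code: Op, RunCLI, Get, and the stub handler are hypothetical, and the `<op> <id>` argument shape for the command line is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Op is a hypothetical single definition of one operation. Its name
// doubles as the CLI subcommand and the URI authority.
type Op struct {
	Name    string
	Handler func(id string) (string, error)
}

var ops = map[string]Op{
	"room": {Name: "room", Handler: func(id string) (string, error) {
		return "room record " + id, nil // a real handler would fetch here
	}},
}

// dispatch is the one code path both entry points share.
func dispatch(name, id string) (string, error) {
	op, ok := ops[name]
	if !ok {
		return "", fmt.Errorf("unknown op %q", name)
	}
	return op.Handler(id)
}

// RunCLI serves command-line arguments of the assumed form <op> <id>.
func RunCLI(args []string) (string, error) {
	if len(args) != 2 {
		return "", fmt.Errorf("usage: <op> <id>")
	}
	return dispatch(args[0], args[1])
}

// Get serves airbnb://<op>/<id> URIs on behalf of a host program.
func Get(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", err
	}
	return dispatch(u.Host, strings.TrimPrefix(u.Path, "/"))
}

func main() {
	fromCLI, _ := RunCLI([]string{"room", "12345"})
	fromURI, _ := Get("airbnb://room/12345")
	fmt.Println(fromCLI == fromURI)
}
```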