2 changes: 2 additions & 0 deletions stay-vs-hotel-scout/.gitignore
@@ -0,0 +1,2 @@
node_modules/
.env
238 changes: 238 additions & 0 deletions stay-vs-hotel-scout/README.md
@@ -0,0 +1,238 @@
# Stay vs Hotel Scout

**Compare Airbnb, Booking.com & Agoda — all at once, powered by AI.**

Searching for accommodation means juggling three different tabs, three different layouts, and three different pricing formats. Stay vs Hotel Scout launches one TinyFish browser agent per platform in parallel and streams results back in real time. A two-stage Gemini pipeline briefs you before the results arrive, then ranks every listing afterwards with per-listing reasoning, benefits, and drawbacks.

---

## Architecture

```
┌──────────────────────────────────────────────────────────────┐
│                       Browser (Client)                       │
│                                                              │
│  City + dates + guests + trip type → Search All Platforms    │
│                                                              │
│  Stage 1: Trip Briefing card (Gemini, fires immediately)     │
│  Stage 2: Live agent iframes per platform (TinyFish)         │
│  Stage 3: AI Smart Summary + per-listing ranking (Gemini)    │
└─────────────────────┬────────────────────────────────────────┘
                      │  GET  /api/search/live  (SSE)
                      │  POST /api/brief        (pre-search)
                      │  POST /api/rank         (post-search)
┌─────────────────────▼────────────────────────────────────────┐
│                      Express.js Backend                      │
│                                                              │
│  /api/search/live                                            │
│    └─ TinyFish SDK ──► Promise.allSettled                    │
│       client.agent.stream({ url, goal, browser_profile,      │
│                             proxy_config })                  │
│              │                                               │
│              ├── Agent → Airbnb       (stealth + US proxy)   │
│              ├── Agent → Booking.com  (stealth + US proxy)   │
│              └── Agent → Agoda        (stealth + US proxy)   │
│                                                              │
│  /api/brief  ──► Gemini (pre-search trip briefing)           │
│  /api/rank   ──► Gemini (post-search listing ranking)        │
└──────────────────────────────────────────────────────────────┘

No database. No cache. Every search hits live platforms in real time.
```

### TinyFish SDK event flow

```
client.agent.stream({ url, goal })
  ├── onStreamingUrl → live iframe URL forwarded to client via SSE
  └── onComplete
        └── RunStatus.COMPLETED → extractListings(event.result)
              → enriched listings → SSE → client
```
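
A minimal sketch of how one of these events might be framed for the SSE stream. The `formatSseEvent` helper, the `streaming_url` event name, and the payload shape are assumptions for illustration and are not taken from `server.js`; only the `event:` / `data:` framing is standard Server-Sent Events.

```js
// Hypothetical helper: serialise one Server-Sent Events frame.
// A frame is an "event:" line, a "data:" line, and a terminating blank line.
function formatSseEvent(eventName, payload) {
  // JSON.stringify escapes newlines, so the payload fits on one data: line
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Assumed event name and payload for an onStreamingUrl callback
const frame = formatSseEvent('streaming_url', {
  platform: 'booking',
  url: 'https://example.com/live/abc',
});

console.log(frame);
```

On an Express response, a string like this would typically be passed to `res.write(...)` after a `Content-Type: text/event-stream` header has been sent.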

### Two-stage Gemini pipeline

```
User hits Search
  ├── [immediately] POST /api/brief
  │     └── Gemini: city context, best platform for trip type,
  │                 3 tips, what to prioritise
  │         → Trip Briefing card shown during search
  └── [after all agents complete] POST /api/rank
        └── Gemini: rank all listings by value, rating, fees,
                    trip-purpose suitability
            → score/10, reasoning, benefits[], drawbacks[]
            → rank numbers + "Why?" button on every card
```
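
The ordering above can be sketched as a small orchestration function. This is an illustration under stated assumptions, not the code in `App.tsx`: `runTwoStagePipeline` and its three callbacks are hypothetical stand-ins for `POST /api/brief`, the SSE search, and `POST /api/rank`.

```js
// Sketch of the client-side ordering. All three callbacks are hypothetical
// and are expected to return promises.
async function runTwoStagePipeline({ fetchBrief, runSearch, fetchRanking }) {
  // Stage 1: start the briefing but do not await it yet. A failed briefing
  // resolves to null so it cannot block or crash the search.
  const briefPromise = fetchBrief().catch(() => null);

  // Agents run while the briefing request is in flight
  const listings = await runSearch();

  // Stage 2: rank only once every listing has been collected
  const ranking = listings.length > 0 ? await fetchRanking(listings) : null;

  return { brief: await briefPromise, listings, ranking };
}

// Example with stubbed callbacks
runTwoStagePipeline({
  fetchBrief: async () => ({ tips: ['Book early'] }),
  runSearch: async () => [{ name: 'Hotel A', platform: 'booking' }],
  fetchRanking: async (listings) => ({ ranked: listings.map((l, i) => ({ rank: i + 1, name: l.name })) }),
}).then((result) => console.log(result));
```

Catching the briefing error up front is a design choice in this sketch: the briefing is advisory, so its failure should not prevent listings from being shown.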

---

## Features

- **3 parallel agents** — Airbnb, Booking.com, and Agoda searched simultaneously
- **Live browser previews** — iframe streams of each agent while it works
- **Pre-search AI briefing** — Gemini gives trip-specific advice before results arrive
- **Post-search AI ranking** — every listing ranked with a score, reasoning, and benefits/drawbacks
- **Smart Summary** — top pick, best budget, and best rated cards at a glance
- **Trip Type selector** — Leisure, Business, Family, Romantic, Budget, or Backpacking; all Gemini analysis is tailored to your purpose
- **City autocomplete** — Google Maps Places API for accurate city names
- **Gemini model fallback** — automatically tries `gemini-2.5-flash-lite` → `gemini-2.5-flash` → `gemini-2.5-pro` if a model is under load
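
The model fallback in the last bullet could look roughly like the loop below. It is a sketch, not the project's implementation: `generateWithFallback` is a hypothetical name, and `callModel` is an injected placeholder for whatever function actually calls Gemini. This version moves on after any error, whereas the real code may only do so for overload errors.

```js
// Model order taken from the feature list above
const MODEL_CHAIN = ['gemini-2.5-flash-lite', 'gemini-2.5-flash', 'gemini-2.5-pro'];

// `callModel(model, prompt)` is a hypothetical async function supplied by the caller
async function generateWithFallback(prompt, callModel, models = MODEL_CHAIN) {
  let lastError;
  for (const model of models) {
    try {
      // First model that answers wins
      return { model, text: await callModel(model, prompt) };
    } catch (err) {
      lastError = err; // remember the failure and try the next model
    }
  }
  // Every model failed: surface the most recent error
  throw lastError ?? new Error('No models configured');
}

// Example with a stub that simulates the first model being overloaded
generateWithFallback('Brief me on Lisbon', async (model) => {
  if (model === 'gemini-2.5-flash-lite') throw new Error('overloaded');
  return `answer from ${model}`;
}).then((result) => console.log(result.model));
```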

---

## Scraping Flow

1. User fills in city, check-in, check-out, guests, and trip type and clicks Search
2. `/api/search/live` opens an SSE connection and fires `search_start`
3. `/api/brief` fires immediately — Gemini returns trip briefing while agents work
4. Three TinyFish agents launch in parallel via `Promise.allSettled`, each with stealth browser profile and US proxy
5. `onStreamingUrl` events forward live iframe URLs to the client as agents start
6. Each agent handles cookie banners, popups, login prompts, and currency formats automatically
7. `onComplete` + `RunStatus.COMPLETED` → `extractListings` parses the JSON result → enriched with platform metadata → streamed to client
8. After all agents finish, `complete` event fires and `/api/rank` is called with all listings
9. Gemini ranks every listing and returns scores + reasoning → rank numbers and "Why?" buttons appear on every card
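
Step 4 relies on `Promise.allSettled` so that one blocked platform does not discard the others. A minimal sketch of that fan-out follows; `searchAllPlatforms` and the injected `runAgent` callback are hypothetical names, and the real TinyFish call is deliberately left out.

```js
// `runAgent(platformKey)` is a hypothetical async function that resolves
// to an array of listings or rejects if the agent fails.
async function searchAllPlatforms(platformKeys, runAgent) {
  // allSettled waits for every agent and never rejects as a whole
  const settled = await Promise.allSettled(platformKeys.map((key) => runAgent(key)));

  return settled.map((outcome, i) =>
    outcome.status === 'fulfilled'
      ? { platform: platformKeys[i], listings: outcome.value, error: null }
      : { platform: platformKeys[i], listings: [], error: String(outcome.reason?.message ?? outcome.reason) }
  );
}

// Example: Agoda fails, the other two still return results
searchAllPlatforms(['airbnb', 'booking', 'agoda'], async (key) => {
  if (key === 'agoda') throw new Error('blocked');
  return [{ name: `${key} listing` }];
}).then((results) => console.log(results));
```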

---

## TinyFish Agent Goals

Each platform gets a numbered, explicit goal prompt. Example for Booking.com:

```
You are on Booking.com showing hotel search results for ${city},
check-in ${checkIn}, check-out ${checkOut}, ${guests} guest(s).

1. If you see a cookie consent banner, click "Accept" or "Decline" to dismiss it.
2. If you see a sign-in or registration modal, close it using the X button.
3. Scroll down slightly to see the property listing cards.
4. Extract data from the first 5 property cards you can see.
5. For each property extract: name, property_type, price_per_night in USD,
total_price in USD, rating (0-10), review_count, listing_url,
breakfast_included (true/false/null).

Return ONLY a valid JSON array:
[{name, property_type, price_per_night, total_price, rating,
review_count, listing_url, breakfast_included}]
Use null for missing fields.
```

All three agents use `browser_profile: 'stealth'` and `proxy_config: { enabled: true, country_code: 'US' }`.
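
As a rough sketch, a platform entry can be turned into the options object shown in the architecture diagram like this. `buildAgentRequest` and `examplePlatform` are hypothetical; the entry only mirrors the shape used in `lib/platforms.js`, and the URL is a placeholder.

```js
// Trimmed, hypothetical platform entry in the shape used by lib/platforms.js
const examplePlatform = {
  name: 'Example',
  searchUrl: (city, checkIn, checkOut, guests) =>
    `https://example.com/search?city=${encodeURIComponent(city)}&in=${checkIn}&out=${checkOut}&adults=${guests}`,
  goal: (city, checkIn, checkOut, guests) =>
    `Find stays in ${city} from ${checkIn} to ${checkOut} for ${guests} guest(s).`,
  browserProfile: 'stealth',
  proxyConfig: { enabled: true, country_code: 'US' },
};

// Map the camelCase config fields onto the snake_case option names
function buildAgentRequest(platform, { city, checkIn, checkOut, guests }) {
  return {
    url: platform.searchUrl(city, checkIn, checkOut, guests),
    goal: platform.goal(city, checkIn, checkOut, guests),
    browser_profile: platform.browserProfile,
    proxy_config: platform.proxyConfig,
  };
}

const request = buildAgentRequest(examplePlatform, {
  city: 'New York',
  checkIn: '2025-06-01',
  checkOut: '2025-06-04',
  guests: 2,
});

console.log(request.url);
```

Keeping the URL builder and the goal text in one config object means adding a platform is a data change rather than a code change.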

---

## Gemini Ranking Prompt

After all listings are collected, a slimmed-down version of each one (name, platform, price, rating, fees) is sent to Gemini:

```
You are a travel expert. Analyse these accommodation listings for a
${purpose} trip and return a JSON ranking.

Return ONLY valid JSON:
{
"top_pick": { "name": "...", "platform": "...", "reason": "..." },
"best_budget": { "name": "...", "platform": "...", "reason": "..." },
"best_rated": { "name": "...", "platform": "...", "reason": "..." },
"ranked": [{
"rank": 1, "name": "...", "platform": "...", "score": 8.5,
"summary": "one sentence",
"reasoning": "2-3 sentences considering the trip purpose",
"benefits": ["...", "..."],
"drawbacks": ["...", "..."]
}],
"overall_insight": "2-3 sentence summary for this trip purpose"
}
```
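
Slimming the listings before they go into the prompt might look like the sketch below. The function name `slimListings` and the exact set of kept fields are assumptions based on the field names in `extractListings`; the real server code may keep a different subset.

```js
// Keep only the fields the ranking prompt needs; drop URLs, review counts, etc.
function slimListings(listings) {
  return listings.map(({ name, platform, price_per_night, total_price, rating, cleaning_fee, service_fee }) => ({
    name,
    platform,
    price_per_night,
    total_price,
    rating,
    // Fees only exist for some platforms, so default them to null
    cleaning_fee: cleaning_fee ?? null,
    service_fee: service_fee ?? null,
  }));
}

const slim = slimListings([
  {
    name: 'Canal View Loft',
    platform: 'Airbnb',
    price_per_night: 120,
    total_price: 410,
    rating: 4.8,
    review_count: 231,
    listing_url: 'https://example.com/rooms/1',
    cleaning_fee: 50,
  },
]);

console.log(JSON.stringify(slim));
```

A smaller payload keeps the prompt short, which matters because every listing from all three platforms is ranked in a single request.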

---

## Setup

### Prerequisites

- Node.js 18+
- TinyFish API key — [get one here](https://agent.tinyfish.ai/api-keys)
- Gemini API key — [Google AI Studio](https://aistudio.google.com/app/apikey)
- Google Maps API key (optional, for city autocomplete) — [Google Cloud Console](https://console.cloud.google.com/)

### Environment Variables

Create a `.env` file in the project root:

```env
TINYFISH_API_KEY=your-tinyfish-api-key
GEMINI_API_KEY=your-gemini-api-key
GOOGLE_MAPS_KEY=your-google-maps-key # optional
PORT=3000
```
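
A startup check for the two required keys can be sketched as follows. `findMissingEnv` is a hypothetical helper and is not part of `server.js`; in a real server the first argument would be `process.env`.

```js
// The two keys the README marks as required
const REQUIRED_ENV = ['TINYFISH_API_KEY', 'GEMINI_API_KEY'];

// Return the names of required variables that are unset or blank
function findMissingEnv(env, required = REQUIRED_ENV) {
  return required.filter((name) => typeof env[name] !== 'string' || env[name].trim() === '');
}

// Example with a plain object standing in for process.env
const missing = findMissingEnv({ TINYFISH_API_KEY: 'abc', PORT: '3000' });
if (missing.length > 0) {
  console.warn(`Missing required environment variables: ${missing.join(', ')}`);
}
```

Failing fast at startup gives a clearer error than a rejected API call in the middle of a search.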

### Install & Run

```bash
cd stay-vs-hotel-scout
npm install
npm run build
npm run dev
```

Open [http://localhost:3000](http://localhost:3000)

### Scripts

| Script | Description |
|--------|-------------|
| `npm run dev` | Start the Express server with `--watch` (restarts on file changes) |
| `npm run build` | Build React frontend with Parcel into `public/` |
| `npm run watch` | Watch and rebuild frontend on file changes |

---

## Project Structure

```
stay-vs-hotel-scout/
├── src/
│   ├── index.html            # Entry point
│   ├── index.css             # Tailwind v4 + brand colour
│   ├── types.ts              # Shared TypeScript interfaces
│   ├── App.tsx               # Root component: state, SSE, Gemini calls
│   └── components/
│       ├── SearchForm.tsx    # City autocomplete, dates, guests, trip type
│       ├── PlatformCard.tsx  # Per-platform column with live iframe
│       ├── ListingCard.tsx   # Single listing: price, rating, rank, Why? toggle
│       └── SmartSummary.tsx  # AI summary: top pick, budget, rated, full ranking
├── lib/
│   ├── platforms.js          # Platform configs: URL builders + agent goals
│   └── helpers.js            # extractListings, sanitizeInput, calcNights
├── server.js                 # Express: SSE search, /api/brief, /api/rank
├── .postcssrc                # Tailwind v4 PostCSS config
├── tsconfig.json
└── package.json
```

---

## Tech Stack

| Layer | Technology |
|-------|-----------|
| Frontend | React 18 + TypeScript, Tailwind CSS v4, Parcel |
| Backend | Express.js, Node.js |
| Browser Agents | TinyFish SDK (`client.agent.stream`) |
| AI | Google Gemini (`@google/generative-ai`) |
| City Autocomplete | Google Maps Places API |
| Streaming | Server-Sent Events (SSE) |

---

## Environment Variables Reference

| Variable | Required | Description |
|----------|----------|-------------|
| `TINYFISH_API_KEY` | Yes | Browser agent access |
| `GEMINI_API_KEY` | Yes | Pre-search briefing + post-search ranking |
| `GOOGLE_MAPS_KEY` | No | City autocomplete in the search form |
| `PORT` | No | Server port (default: 3000) |
78 changes: 78 additions & 0 deletions stay-vs-hotel-scout/lib/helpers.js
@@ -0,0 +1,78 @@
// Normalise a raw agent result into at most 5 listings with typed fields.
export function extractListings(raw) {
  if (!raw) return [];
  try {
    const arr = findArray(raw);
    if (!arr) return [];
    return arr
      .filter((item) => item && typeof item === 'object')
      .map((item) => ({
        name: String(item.name ?? ''),
        property_type: item.property_type ? String(item.property_type) : null,
        price_per_night: toNumber(item.price_per_night),
        total_price: toNumber(item.total_price),
        rating: toNumber(item.rating),
        review_count: toNumber(item.review_count),
        listing_url: item.listing_url ? String(item.listing_url) : null,
        cleaning_fee: toNumber(item.cleaning_fee),
        service_fee: toNumber(item.service_fee),
        // Compare as text: Boolean("false") would be true if the agent returns a string
        breakfast_included:
          item.breakfast_included != null
            ? !['false', 'no', '0', ''].includes(String(item.breakfast_included).trim().toLowerCase())
            : null,
        member_price: toNumber(item.member_price),
      }))
      .slice(0, 5);
  } catch {
    // Malformed agent output is treated as "no listings"
    return [];
  }
}

// Walk the result object (any depth, any key) looking for the first non-empty array
// of objects that looks like listings.
function findArray(val) {
  if (Array.isArray(val) && val.length > 0 && typeof val[0] === 'object') return val;

  if (typeof val === 'string') {
    // Strip markdown code fences the agent sometimes wraps around JSON.
    // Trim first so leading whitespace does not stop the ^ anchor from matching.
    const cleaned = val.trim().replace(/^```(?:json)?\s*/i, '').replace(/\s*```$/, '').trim();
    try {
      const parsed = JSON.parse(cleaned);
      const result = findArray(parsed);
      if (result) return result;
    } catch {
      // Not JSON: fall through and report no array
    }
    return null;
  }

  if (val && typeof val === 'object' && !Array.isArray(val)) {
    // Try well-known keys first
    for (const key of ['output', 'result', 'data', 'listings', 'results', 'answer', 'content']) {
      if (val[key] !== undefined) {
        const found = findArray(val[key]);
        if (found) return found;
      }
    }
    // Fall back to scanning all values
    for (const v of Object.values(val)) {
      const found = findArray(v);
      if (found) return found;
    }
  }

  return null;
}

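// Parse the first numeric token in a value such as "$1,234.50" or "8.4 (120 reviews)".
function toNumber(val) {
  if (val == null) return null;
  if (typeof val === 'number') return Number.isFinite(val) ? val : null;
  // Drop thousands separators, then take the first number instead of
  // concatenating every digit in the string ("8.4 (120)" must not become 8.412)
  const match = String(val).replace(/,/g, '').match(/\d+(?:\.\d+)?|\.\d+/);
  return match ? parseFloat(match[0]) : null;
}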

export function sanitizeInput(input, maxLength = 100) {
  return String(input).replace(/[<>"'`]/g, '').trim().slice(0, maxLength);
}

export function calcNights(checkIn, checkOut) {
  const ms = new Date(checkOut).getTime() - new Date(checkIn).getTime();
  return Math.max(1, Math.round(ms / 86_400_000));
}

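export function isValidDate(d) {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(d)) return false;
  const t = new Date(d).getTime();
  // Round-trip through ISO so a date the parser might roll over (e.g. 2024-02-30) is rejected
  return !isNaN(t) && new Date(t).toISOString().slice(0, 10) === d;
}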
55 changes: 55 additions & 0 deletions stay-vs-hotel-scout/lib/platforms.js
@@ -0,0 +1,55 @@
export const PLATFORMS = {
  airbnb: {
    name: 'Airbnb',
    type: 'airbnb',
    searchUrl: (city, checkIn, checkOut, guests) =>
      `https://www.airbnb.com/s/${encodeURIComponent(city)}/homes?checkin=${checkIn}&checkout=${checkOut}&adults=${guests}`,
    goal: (city, checkIn, checkOut, guests) =>
      `Search Airbnb for accommodations in ${city} from ${checkIn} to ${checkOut} for ${guests} guest(s). ` +
      `Dismiss any popups or banners. Find the first 5 available listings. ` +
      `For each listing extract: name, property_type (Entire home/Private room/etc), price_per_night in USD, ` +
      `total_price in USD for the full stay, rating (number out of 5), review_count, listing_url, and any fees mentioned (cleaning_fee, service_fee). ` +
      `Return ONLY a JSON array: [{name, property_type, price_per_night, total_price, rating, review_count, listing_url, cleaning_fee, service_fee}]. Use null for missing fields.`,
    browserProfile: 'stealth',
    proxyConfig: { enabled: true, country_code: 'US' },
  },

  booking: {
    name: 'Booking.com',
    type: 'hotel',
    searchUrl: (city, checkIn, checkOut, guests) => {
      const [inYear, inMonth, inDay] = checkIn.split('-');
      const [outYear, outMonth, outDay] = checkOut.split('-');
      return (
        `https://www.booking.com/searchresults.html?ss=${encodeURIComponent(city)}` +
        `&checkin_year=${inYear}&checkin_month=${inMonth}&checkin_monthday=${inDay}` +
        `&checkout_year=${outYear}&checkout_month=${outMonth}&checkout_monthday=${outDay}` +
        `&group_adults=${guests}&no_rooms=1`
      );
    },
    goal: (city, checkIn, checkOut, guests) =>
      `Search Booking.com for hotels in ${city} checking in ${checkIn} and checking out ${checkOut} for ${guests} guest(s). ` +
      `Dismiss any popups, cookie banners, or sign-in prompts. Find the first 5 available properties sorted by default. ` +
      `For each property extract: name, property_type (Hotel/Apartment/Hostel/etc), price_per_night in USD, ` +
      `total_price in USD for the full stay, rating (number out of 10), review_count, listing_url, and breakfast_included (true/false). ` +
      `Return ONLY a JSON array: [{name, property_type, price_per_night, total_price, rating, review_count, listing_url, breakfast_included}]. Use null for missing fields.`,
    browserProfile: 'stealth',
    proxyConfig: { enabled: true, country_code: 'US' },
  },

  agoda: {
    name: 'Agoda',
    type: 'hotel',
    searchUrl: (city, checkIn, checkOut, guests) =>
      `https://www.agoda.com/search?city=${encodeURIComponent(city)}&checkIn=${checkIn}&checkOut=${checkOut}&adults=${guests}&rooms=1`,
    goal: (city, checkIn, checkOut, guests) =>
      `Search Agoda for hotels in ${city} checking in ${checkIn} and checking out ${checkOut} for ${guests} guest(s). ` +
      `Dismiss any popups, cookie banners, or sign-in prompts. Wait for the hotel listings to fully load. ` +
      `Find the first 5 available hotel listings shown on the page. ` +
      `For each hotel extract: name, property_type, price_per_night in USD, total_price in USD for the full stay, ` +
      `rating (number out of 10), review_count, listing_url. ` +
      `Return ONLY a JSON array: [{name, property_type, price_per_night, total_price, rating, review_count, listing_url}]. Use null for missing fields.`,
    browserProfile: 'stealth',
    proxyConfig: { enabled: true, country_code: 'US' },
  },
};