5 changes: 4 additions & 1 deletion .gitignore
@@ -49,4 +49,7 @@ logo-manager/package-lock.json

 # others
 /sample-gsoc-guide/
-/things-to-do/
+/things-to-do/
+
+# data backups (local safety net, not for version control)
+/new-api-details-backup-*/
149 changes: 149 additions & 0 deletions CHANGELOG-2026.md
@@ -0,0 +1,149 @@
# GSoC 2026 Data Integration — Changelog

## Data Pipeline: 4 new scripts

### `scripts/fetch-year-data.ts`
Fetches raw org data from Google's API. Reusable for any year.
```bash
npx tsx scripts/fetch-year-data.ts --year 2026
```
Output: `new-api-details/yearly/google-summer-of-code-2026-organizations-raw.json`

---

### `scripts/transform-year-organizations.ts`
Reads raw API JSON → updates/creates per-org JSON files → regenerates `index.json` + `metadata.json`.

```bash
npx tsx scripts/transform-year-organizations.ts --year 2026
```

What it does for **returning orgs** (156):
```
- adds 2026 to active_years
- updates last_year to 2026
- sets is_currently_active: true
- merges new technologies/topics (union, no deletions)
- updates contact/social from API
```
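
The returning-org update listed above can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/transform-year-organizations.ts`: the `OrgRecord` and `ApiOrg` shapes and the `mergeReturningOrg` helper are hypothetical, exact-string matching for the union is an assumption, and the contact/social update is left out.

```ts
// Hypothetical per-org record; only the fields named in this changelog are modeled.
interface OrgRecord {
  first_year: number;
  last_year: number;
  active_years: number[];
  is_currently_active: boolean;
  technologies: string[];
  topics: string[];
}

// Hypothetical slice of the raw API entry (field names are assumptions).
interface ApiOrg {
  technologies: string[];
  topics: string[];
}

// Union that keeps every existing entry and appends only unseen ones.
function unionNoDeletions(existing: string[], incoming: string[]): string[] {
  const seen = new Set(existing);
  const merged = [...existing];
  for (const item of incoming) {
    if (!seen.has(item)) {
      seen.add(item);
      merged.push(item);
    }
  }
  return merged;
}

function mergeReturningOrg(org: OrgRecord, api: ApiOrg, year: number): OrgRecord {
  // Add the year once and keep the list sorted ascending.
  const activeYears = Array.from(new Set([...org.active_years, year])).sort((a, b) => a - b);
  return {
    ...org,
    active_years: activeYears,
    last_year: year,
    is_currently_active: true,
    technologies: unionNoDeletions(org.technologies, api.technologies),
    topics: unionNoDeletions(org.topics, api.topics),
  };
}
```

Because the year is added through a set, re-running the transform for the same year would not duplicate it in `active_years` under this sketch.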

What it does for **new orgs** (29):
```
- creates new JSON file with first_year: 2026, active_years: [2026]
- maps Google API fields → internal format
```

What it does for **orgs not in 2026** (48):
```
- sets is_currently_active: false (nothing else changed)
```
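
One way to read the three groups above is as a partition computed from the existing org files and the raw API list. The sketch below assumes orgs are matched by a `slug` field and that only orgs previously flagged active are deactivated; both are assumptions, and `classifyOrgs` is a hypothetical helper.

```ts
// Hypothetical minimal view of an existing per-org file.
interface ExistingOrg {
  slug: string;
  is_currently_active: boolean;
}

interface TransformPlan {
  returning: string[];    // present before and in this year's API data
  firstTime: string[];    // in the API data but with no existing file
  toDeactivate: string[]; // previously active, absent from this year's API data
}

function classifyOrgs(existing: ExistingOrg[], apiSlugs: string[]): TransformPlan {
  const existingSlugs = new Set(existing.map((o) => o.slug));
  const inApi = new Set(apiSlugs);
  return {
    returning: existing.filter((o) => inApi.has(o.slug)).map((o) => o.slug),
    firstTime: apiSlugs.filter((s) => !existingSlugs.has(s)),
    toDeactivate: existing
      .filter((o) => o.is_currently_active && !inApi.has(o.slug))
      .map((o) => o.slug),
  };
}
```

Orgs that were already inactive and are absent from the API data fall into none of the three groups, so their files stay untouched.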

`first_time` is **derived** in the index, not stored per-org:
```ts
first_time: data.first_year === YEAR
```
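
As a rough sketch of how that derivation might sit in the index build (the `OrgFile` and `IndexEntry` shapes and both helpers are hypothetical, not the script's actual types):

```ts
// Hypothetical subset of a per-org JSON file.
interface OrgFile {
  slug: string;
  first_year: number;
  is_currently_active: boolean;
}

// Hypothetical index entry; first_time exists only here, not in the org file.
interface IndexEntry {
  slug: string;
  is_currently_active: boolean;
  first_time: boolean;
}

function toIndexEntry(data: OrgFile, year: number): IndexEntry {
  return {
    slug: data.slug,
    is_currently_active: data.is_currently_active,
    // Derived on every run, so it never goes stale in the org file.
    first_time: data.first_year === year,
  };
}

function countFirstTime(entries: IndexEntry[]): number {
  return entries.filter((e) => e.first_time === true).length;
}
```

With this shape, regenerating the index for a later year clears the flag for the 2026 newcomers without editing their files.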

---

### `scripts/generate-yearly-page-from-json.ts`
Produces `new-api-details/yearly/google-summer-of-code-2026.json` from org JSON files. No DB.

```bash
npx tsx scripts/generate-yearly-page-from-json.ts --year 2026
```

The output matches the `YearlyPageData` type. The projects array is empty because projects have not been announced yet, and `finalized` is `false`.

---

### `scripts/regenerate-tech-topics-from-json.ts`
Rebuilds all tech-stack, topics, and homepage JSON from org files. No DB. Years are derived dynamically from the data.

```bash
npx tsx scripts/regenerate-tech-topics-from-json.ts
```

Regenerated:
- 825 tech-stack JSON files + index (all now include 2026 in `popularity_by_year`)
- 1566 topic JSON files + index (all now include 2026 in `yearlyStats`)
- `homepage.json` (updated metrics: 533 total orgs, 185 active)
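
To illustrate what "derived dynamically" could look like for `popularity_by_year`, here is a minimal sketch. It assumes an org's technologies count toward every year in its `active_years` and that every technology gets a zero entry for years with no orgs; `OrgInput` and `buildTechPopularity` are hypothetical names, not the script's own.

```ts
// Hypothetical input: the two per-org fields this sketch needs.
interface OrgInput {
  technologies: string[];
  active_years: number[];
}

// year (as a string key) -> number of orgs using the technology that year
type PopularityByYear = Record<string, number>;

function buildTechPopularity(orgs: OrgInput[]): Map<string, PopularityByYear> {
  // Derive the year range from the data instead of hardcoding it.
  const allYears: number[] = [];
  for (const org of orgs) allYears.push(...org.active_years);
  const years = Array.from(new Set(allYears)).sort((a, b) => a - b);

  const result = new Map<string, PopularityByYear>();
  for (const org of orgs) {
    for (const tech of Array.from(new Set(org.technologies))) {
      let counts = result.get(tech);
      if (!counts) {
        // Start each technology with a zero for every known year.
        counts = {};
        for (const y of years) counts[String(y)] = 0;
        result.set(tech, counts);
      }
      for (const y of org.active_years) counts[String(y)] += 1;
    }
  }
  return result;
}
```

Under these assumptions, a new year appears in every technology's entry as soon as one org lists it in `active_years`, which matches the "all now include 2026" note above.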

---

## UI fixes

### `app/organizations/filters-sidebar.tsx`
Added 2026 to the `YEARS` filter array.
```ts
// before
const YEARS = [2025, 2024, ...]
// after
const YEARS = [2026, 2025, 2024, ...]
```

### `app/yearly/page.tsx`
- Added `{ year: 2026, slug: "google-summer-of-code-2026" }` to `yearlyPages`
- Updated stats: "11" years, "11,000+" projects, "2026" latest
- CTA button now links to 2026

### `app/yearly/[slug]/page.tsx`
Added `{ slug: "google-summer-of-code-2026" }` to `generateStaticParams`.

### `lib/projects-page-types.ts`
Added 2026 to `getAvailableProjectYears()`.

### `package.json`
Added npm scripts:
```json
"gsoc:fetch": "npx tsx scripts/fetch-year-data.ts",
"gsoc:transform": "npx tsx scripts/transform-year-organizations.ts",
"gsoc:yearly": "npx tsx scripts/generate-yearly-page-from-json.ts",
"gsoc:regen": "npx tsx scripts/regenerate-tech-topics-from-json.ts",
"gsoc:sync": "... all four in sequence ..."
```

### `.gitignore`
Added `/new-api-details-backup-*/` to ignore backup folders.

---

## Backup

`new-api-details-backup-pre2026/` — full copy of all data before any 2026 changes. Gitignored.

---

## Data summary

| Metric | Value |
|---|---|
| Total orgs in index | 533 (was 504) |
| Active in 2026 | 185 |
| Returning | 156 |
| First-time | 29 |
| Marked inactive | 48 |
| Projects | 0 (not yet announced) |
| Top language | Python (121 orgs) |
| Years covered | 11 (2016–2026) |

---

## Future year workflow

```bash
npm run gsoc:fetch -- --year 2027
npm run gsoc:transform -- --year 2027
npm run gsoc:yearly -- --year 2027
npm run gsoc:regen
```

Then update the hardcoded places:
1. `app/yearly/page.tsx`: `yearlyPages` array, plus the stats and CTA link
2. `app/yearly/[slug]/page.tsx`: `generateStaticParams`
3. `app/organizations/filters-sidebar.tsx`: `YEARS` array
4. `lib/projects-page-types.ts`: `getAvailableProjectYears()`

---

## Important: dev server

Next.js caches JSON imports at startup. After running any script that changes JSON files, **restart the dev server** (`Ctrl+C` then `npm run dev`) for changes to appear.
7 changes: 4 additions & 3 deletions app/organizations/filters-sidebar.tsx
@@ -23,9 +23,10 @@ interface FiltersSidebarProps {
 onFilterChange: (filters: FilterState) => void
 filters: FilterState
 availableTechs: Array<{ name: string; count: number }>
+firstTimeCount?: number
 }

-const YEARS = [2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012]
+const YEARS = [2026, 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2016, 2015, 2014, 2013, 2012]
 const CATEGORIES = [
 'Artificial Intelligence',
 'Data',
@@ -49,7 +50,7 @@ const TOPICS = [
 'Database',
 ]

-export function FiltersSidebar({ onFilterChange, filters, availableTechs }: FiltersSidebarProps) {
+export function FiltersSidebar({ onFilterChange, filters, availableTechs, firstTimeCount }: FiltersSidebarProps) {

 const [sidebarSearch] = useState('')
 const [expandedSections, setExpandedSections] = useState({
@@ -225,7 +226,7 @@ export function FiltersSidebar({ onFilterChange, filters, availableTechs }: Filt
 onChange={toggleFirstTime}
 />
 <span className="text-sm text-gray-700 dark:text-foreground">First-time organizations only</span>
-<span className="text-xs text-gray-400">(14)</span>
+{firstTimeCount !== undefined && <span className="text-xs text-gray-400">({firstTimeCount})</span>}
 </label>
 </div>
 </div>
50 changes: 16 additions & 34 deletions app/organizations/organizations-client.tsx
@@ -16,9 +16,10 @@ interface OrganizationsClientProps {
 initialData: PaginatedResponse<Organization>
 initialPage: number
 initialTechs: Array<{ name: string; count: number }>
+firstTimeCount?: number
 }

-export function OrganizationsClient({ initialData, initialPage, initialTechs }: OrganizationsClientProps) {
+export function OrganizationsClient({ initialData, initialPage, initialTechs, firstTimeCount }: OrganizationsClientProps) {
 const router = useRouter()
 const searchParams = useSearchParams()
 const [data, setData] = useState<PaginatedResponse<Organization>>(initialData)
@@ -27,6 +28,14 @@ export function OrganizationsClient({ initialData, initialPage, initialTechs }:
 const isInitialMount = useRef(true)
 const lastFetchParams = useRef<string>('')
 const lastUrlString = useRef<string>('')

+// Sync server-rendered data when initialData/initialPage change after navigation.
+// Without this, router.push() re-renders on the server but the client keeps stale state.
+useEffect(() => {
+setData(initialData)
+setCurrentPage(initialPage)
+setIsLoading(false)
+}, [initialData, initialPage])
+
 // Memoize filters from URL using primitives to avoid unnecessary recalculations
 const urlFilters = useMemo<FilterState>(() => {
@@ -207,53 +216,26 @@ export function OrganizationsClient({ initialData, initialPage, initialTechs }:
 }
 }, [])

-// Handle page changes from URL
-// Only fetch if page actually changed AND we're not on initial mount
-useEffect(() => {
-if (isInitialMount.current) {
-return
-}
-
-const page = Number(searchParams.get('page')) || 1
-if (page !== currentPage) {
-setCurrentPage(page)
-// Only fetch if we have dynamic filters (search or complex filters)
-// Otherwise, pagination should be handled client-side with static data
-const hasDynamicFilters = filters.search ||
-filters.yearsLogic === 'AND' ||
-filters.categoriesLogic === 'AND' ||
-filters.techsLogic === 'AND' ||
-filters.topicsLogic === 'AND' ||
-(filters.years.length > 0 && filters.categories.length > 0 && filters.techs.length > 0)
-
-if (hasDynamicFilters) {
-fetchOrganizations(page, filters)
-}
-}
-}, [searchParams, currentPage, filters, fetchOrganizations])
+// Page changes are handled via router.push() → server re-render → initialData sync.
+// No client-side fetch needed for pagination.

-// Only fetch when filters change (not on initial mount, as we have initialData)
-// Only fetch if we have dynamic filters that require API (search or complex filters)
+// Filters and search are handled server-side via router.push() → server re-render → initialData sync.
+// Only need client-side API fetch for AND logic filters (rare edge case).
 useEffect(() => {
 if (isInitialMount.current) {
 return
 }

-// Determine if we need API (same logic as server)
 const needsAPI =
-filters.search.trim().length > 0 ||
 filters.yearsLogic === 'AND' ||
 filters.categoriesLogic === 'AND' ||
 filters.techsLogic === 'AND' ||
-filters.topicsLogic === 'AND' ||
-(filters.years.length > 0 && filters.categories.length > 0 && filters.techs.length > 0 && filters.topics.length > 0)
+filters.topicsLogic === 'AND'

-// Only fetch if we need API, otherwise filters are handled client-side with static data
 if (!needsAPI) {
 return
 }

-// Reset to page 1 when filters change
 const page = 1
 setCurrentPage(page)
 fetchOrganizations(page, filters)
@@ -341,7 +323,7 @@ export function OrganizationsClient({ initialData, initialPage, initialTechs }:
 <div className="flex">
 {/* Sidebar - Fixed left, 280px width */}
 <aside className="hidden lg:block w-[280px] shrink-0 bg-background fixed top-20 lg:top-24 left-4 h-[calc(100vh-5rem)] lg:h-[calc(100vh-6rem)] overflow-y-auto custom-scrollbar">
-<FiltersSidebar onFilterChange={handleFilterChange} filters={filters} availableTechs={initialTechs} />
+<FiltersSidebar onFilterChange={handleFilterChange} filters={filters} availableTechs={initialTechs} firstTimeCount={firstTimeCount} />
 </aside>

 {/* Main Content - with left margin for sidebar */}
49 changes: 16 additions & 33 deletions app/organizations/page.tsx
@@ -96,14 +96,6 @@ function shouldUseAPI(params: {
 techsLogic?: string;
 topicsLogic?: string;
 }): boolean {
-// Always use API for search (text search requires DB)
-if (params.q && params.q.trim().length > 0) {
-if (process.env.NODE_ENV === 'development') {
-console.log('[ORGS] Using API: search query detected');
-}
-return true;
-}
-
 // Use API for complex filter logic (AND mode requires DB)
 if (params.yearsLogic === 'AND' || params.categoriesLogic === 'AND' ||
 params.techsLogic === 'AND' || params.topicsLogic === 'AND') {
@@ -113,26 +105,10 @@ function shouldUseAPI(params: {
 return true;
 }

-// Use API if multiple filter types are combined (complex combinations)
-const filterCount = [
-params.years && params.years.trim().length > 0,
-params.categories && params.categories.trim().length > 0,
-params.techs && params.techs.trim().length > 0,
-params.topics && params.topics.trim().length > 0,
-params.firstTimeOnly === 'true',
-].filter(Boolean).length;
-
-// If more than 2 filter types, use API for better performance
-if (filterCount > 2) {
-if (process.env.NODE_ENV === 'development') {
-console.log('[ORGS] Using API: multiple filter types detected', filterCount);
-}
-return true;
-}
-
-// Otherwise, use static JSON
+// All other cases (including text search) use static JSON.
+// Text search over ~500 orgs in memory is fast and includes new orgs not yet in DB.
 if (process.env.NODE_ENV === 'development') {
-console.log('[ORGS] Using static JSON: simple filters or no filters');
+console.log('[ORGS] Using static JSON');
 }
 return false;
 }
@@ -204,12 +180,13 @@ async function getOrganizations(params: {
 );
 }

-// Filter organizations in memory
+// Filter organizations in memory (supports text search + all filters)
 let filtered = indexData.organizations;

-// Apply filters
-if (params.years || params.categories || params.techs || params.topics || params.firstTimeOnly) {
+const hasFilters = params.q || params.years || params.categories || params.techs || params.topics || params.firstTimeOnly || params.tech;
+if (hasFilters) {
 filtered = filterOrganizations(indexData.organizations, {
+query: params.q,
 years: params.years ? params.years.split(',').map(y => parseInt(y)).filter(n => !isNaN(n)) : undefined,
 categories: params.categories ? params.categories.split(',') : undefined,
 techs: params.techs ? params.techs.split(',') : params.tech ? [params.tech] : undefined,
@@ -239,8 +216,8 @@ export default async function OrganizationsPage({ searchParams }: PageProps) {
 const params = await searchParams;
 const page = Number(params.page) || 1;

-// Parallel data fetching: Orgs + Tech Stack
-const [data, techStackIndex] = await Promise.all([
+// Parallel data fetching: Orgs + Tech Stack + Org index (for first-time count)
+const [data, techStackIndex, orgIndex] = await Promise.all([
 getOrganizations({
 page,
 limit: 20,
@@ -257,7 +234,8 @@ export default async function OrganizationsPage({ searchParams }: PageProps) {
 techsLogic: params.techsLogic,
 topicsLogic: params.topicsLogic,
 }),
-loadTechStackIndexData()
+loadTechStackIndexData(),
+loadOrganizationsIndexData()
 ]);

 // Transform tech stack data for sidebar
@@ -266,6 +244,10 @@ export default async function OrganizationsPage({ searchParams }: PageProps) {
 count: t.org_count
 })) || [];

+const firstTimeCount = orgIndex?.organizations.filter(
+(o: { first_time: boolean | null }) => o.first_time === true
+).length ?? 0;
+
 return (
 <Suspense fallback={
 <div className="min-h-[600px] flex items-center justify-center">
@@ -279,6 +261,7 @@ export default async function OrganizationsPage({ searchParams }: PageProps) {
 initialData={data}
 initialPage={page}
 initialTechs={initialTechs}
+firstTimeCount={firstTimeCount}
 />
 </Suspense>
 );