Repository Logic

Overview

This project uses Build-Time Indexing to simulate a dynamic file server on GitHub Pages. Since static hosts cannot list files at runtime, we generate a manifest.json during the GitHub Actions build.

GitHub Pages is a static file server. In a way it resembles an ephemeral virtual machine (like an EC2 instance): it builds the artifacts and then simply serves them at a URL. This means it cannot perform backend operations such as executing a Python script. We therefore combine GitHub Pages with a simple GitHub Actions workflow that keeps the index of the files in our repo up to date.

Normally, when you serve artifacts on the web and there is no index file (index.html, index.md), many web servers show a directory listing by default. GitHub Pages disables this as a security measure. So we have to create an index.html and bundle it with some simple scripting that picks up file updates and serves the data we have.

1. Backend / Build Pipeline (.github/workflows/deploy.yml)

Critical Logic: The workflow generates the file map. If the directory structure changes (e.g., renaming data/), this step must be updated.

For a local build, run tree -J data > manifest.json after every change to the data/ folder. Always run it from the root directory of the codebase, so that manifest.json ends up in the same directory as index.html.

tree prints the directory tree, the -J flag switches the output to JSON, and > manifest.json redirects that output into the file.
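
On a machine where tree is not installed, the same nested structure can be approximated with a few lines of Node.js. This is only a sketch under assumptions: buildManifest and describeEntry are hypothetical names that do not exist in the repository, entries are sorted with a plain string sort (which may differ from tree's ordering), and the trailing summary entry that tree appends is left out.

```javascript
// Hypothetical helper (not part of the repository): builds a manifest in the
// same nested shape as `tree -J data`, minus tree's trailing "report" entry.
const fs = require('fs');
const path = require('path');

// Describe one directory entry as { type, name } plus contents for directories.
function describeEntry(dir, name) {
    const fullPath = path.join(dir, name);
    if (fs.statSync(fullPath).isDirectory()) {
        const contents = fs.readdirSync(fullPath)
            .filter(child => !child.startsWith('.')) // tree hides dotfiles by default
            .sort()
            .map(child => describeEntry(fullPath, child));
        return { type: 'directory', name, contents };
    }
    return { type: 'file', name };
}

// Returns an array with a single root entry, like the output of tree -J.
function buildManifest(rootDir) {
    const resolved = path.resolve(rootDir);
    return [describeEntry(path.dirname(resolved), path.basename(resolved))];
}

// Possible usage, run from the repository root:
// fs.writeFileSync('manifest.json', JSON.stringify(buildManifest('data'), null, 2));
```

Because the frontend only reads type, name, and contents, a manifest built this way should be interchangeable with the one from tree for local testing.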

In our GitHub Pages deployment workflow (static.yml), the manifest is generated during the build, before the deployment steps run. The workflow is triggered by any push to the repository.

      - name: Generate Directory Manifest
        run: |
          sudo apt-get install -y tree
          tree -J data > manifest.json  # <--- CORE MECHANISM

      # ... deployment steps ...

2. Frontend Logic (js/dashboard.js)

Critical Logic: This script parses the manifest.

A. Configuration & Data Fetching

// CHANGE THIS if the repository is renamed or moved
const GITHUB_REPO_URL = "https://github.com/ashbate/intangible-data-plaform";

document.addEventListener('DOMContentLoaded', () => {
    // ... setup ...

    // CORE LOGIC: Fetches the build-time manifest + manual metadata
    Promise.all([
        fetch('manifest.json').then(res => res.ok ? res.json() : []),
        fetch('metadata.json').then(res => res.ok ? res.json() : {})
    ])
    // ...
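
The res.ok check above covers HTTP errors such as a 404, but a failed request or a malformed JSON body would still reject the whole Promise.all. A minimal sketch of a more forgiving loader follows; fetchJsonOr and loadIndexData are hypothetical names, and the fetchFn parameter is an assumption added so the code can run outside a browser.

```javascript
// Hypothetical helper: resolves to `fallback` when the response is not OK,
// when the request itself fails, or when the body is not valid JSON.
function fetchJsonOr(url, fallback, fetchFn = fetch) {
    return fetchFn(url)
        .then(res => (res.ok ? res.json() : fallback))
        .catch(() => fallback);
}

// Loads both files in parallel, mirroring the Promise.all call in dashboard.js.
function loadIndexData(fetchFn = fetch) {
    return Promise.all([
        fetchJsonOr('manifest.json', [], fetchFn),
        fetchJsonOr('metadata.json', {}, fetchFn)
    ]).then(([manifest, metadata]) => ({ manifest, metadata }));
}
```

With this shape, a missing or broken metadata.json only removes DOIs and descriptions from the page rather than preventing the file list from rendering.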

B. DOI & Metadata Mapping

The script attempts to find a DOI in metadata.json. If missing, it falls back to a placeholder.

    // DOI LOGIC:
    // 1. Looks for key matches in metadata.json
    // 2. Fallback value defined here:
    let doiValue = "99.9999/DOI.PLACEHOLDER"; 

    if (metadata[yearFolder.name] && metadata[yearFolder.name].doi) {
        doiValue = metadata[yearFolder.name].doi;
    }
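
The same lookup can be written as a small pure function, which makes the fallback easy to check in isolation. This is a sketch, not the code in dashboard.js: resolveDoi and DOI_PLACEHOLDER are hypothetical names, and only the placeholder value is taken from the snippet above.

```javascript
// Placeholder value copied from the snippet above.
const DOI_PLACEHOLDER = "99.9999/DOI.PLACEHOLDER";

// Hypothetical helper: returns the DOI stored under the year key,
// or the placeholder when the year or its doi field is missing.
function resolveDoi(metadata, yearName) {
    const entry = metadata ? metadata[yearName] : undefined;
    return entry && entry.doi ? entry.doi : DOI_PLACEHOLDER;
}
```

For example, resolveDoi({ "2025": { doi: "10.5555/TEST.DOI.2025" } }, "2025") returns the stored DOI, while any year without an entry yields the placeholder.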

C. File Filtering & Link Generation

We only display specific file types. Update the regular expression below to support new formats (e.g., .pdf or .json).

    folder.contents.forEach(file => {
        // FILTER: Only allows specific extensions
        if (file.type === 'file' && /\.(csv|dta|parquet)$/i.test(file.name)) {

            // ... grouping logic ...
        }
    });

    // ...

    // LINK CONSTRUCTION:
    // Relies on the repo structure: root -> data -> year -> dataset -> file
    linksHtml += `<a href="data/${year}/${folder.name}/${f.name}" ...`;
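
One way to make the extension filter easier to extend, and the links robust against spaces in file names, is sketched below. The names ALLOWED_EXTENSIONS, visibleFiles, and fileHref are hypothetical, and URL-encoding each path segment is an assumption rather than something the current script is known to do.

```javascript
// Extensions shown on the page; add e.g. 'pdf' or 'json' to support new formats.
const ALLOWED_EXTENSIONS = ['csv', 'dta', 'parquet'];
const ALLOWED_FILE_PATTERN = new RegExp(`\\.(${ALLOWED_EXTENSIONS.join('|')})$`, 'i');

// Hypothetical helper: keeps only file entries with an allowed extension.
function visibleFiles(folder) {
    return (folder.contents || []).filter(
        entry => entry.type === 'file' && ALLOWED_FILE_PATTERN.test(entry.name)
    );
}

// Hypothetical helper: builds the relative link data/<year>/<dataset>/<file>,
// encoding each segment so spaces and special characters stay valid in a URL.
function fileHref(year, folderName, fileName) {
    return ['data', year, folderName, fileName].map(encodeURIComponent).join('/');
}
```

Keeping the extensions in an array means supporting a new format is a one-word change instead of an edit to the regular expression itself.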

3. View Layer (index.html)

Critical Logic: The HTML is a skeleton. The element IDs below must match the ones referenced in js/dashboard.js, namely the container and the loading skeleton.

    <div id="file-list-container" class="space-y-12"></div>

    <script src="js/dashboard.js"></script>
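
If the ID in index.html and the one used by the script drift apart, the page tends to fail silently. A small guard like the following sketch turns that into an explicit error; requireContainer and CONTAINER_ID are hypothetical names, and taking the document as a parameter is an assumption made so the function can be tested without a browser.

```javascript
// Must match the id attribute of the container <div> in index.html.
const CONTAINER_ID = 'file-list-container';

// Hypothetical helper: returns the container element or throws a clear error.
function requireContainer(doc) {
    const element = doc.getElementById(CONTAINER_ID);
    if (!element) {
        throw new Error(`index.html must contain an element with id="${CONTAINER_ID}"`);
    }
    return element;
}

// In the browser this would be called as: requireContainer(document)
```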

4. Metadata Schema (metadata.json)

This file is maintained by hand in the repository root. It provides context that the automated tree command cannot generate. Each year's DOI must be added to this file manually.

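Year keys (e.g. "2025") map to the year header; file-name keys map to the file description. JSON does not allow comments, so keep annotations like these out of the actual file.

{
  "2025": {
    "doi": "10.1234/zenodo.2025"
  },
  "intangible_assets_v1.csv": {
    "description": "Primary dataset..."
  }
}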

5. File Examples

Example manifest.json (Automated)

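Generated by tree -J data during build. The real output also ends with a summary entry of type "report" (directory and file counts), which is omitted here.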

[
  {
    "type": "directory",
    "name": "data",
    "contents": [
      {
        "type": "directory",
        "name": "2025",
        "contents": [
          { "type": "file", "name": "table1.csv" },
          { "type": "file", "name": "table1.parquet" }
        ]
      }
    ]
  }
]
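
To see how this nested structure maps onto URLs, the manifest can be flattened into relative file paths with a short recursive walk. This is an illustrative sketch: listFilePaths is a hypothetical name, and entries that are neither files nor directories (such as the "report" summary that tree appends to its JSON output) are simply skipped.

```javascript
// Hypothetical helper: flattens a tree -J style manifest into relative paths.
function listFilePaths(entries, prefix = '') {
    const paths = [];
    for (const entry of entries) {
        if (entry.type === 'file') {
            paths.push(prefix + entry.name);
        } else if (entry.type === 'directory') {
            // Recurse into the directory, extending the path prefix.
            paths.push(...listFilePaths(entry.contents || [], `${prefix}${entry.name}/`));
        }
        // Any other entry type (e.g. tree's "report" object) is ignored.
    }
    return paths;
}
```

Applied to the example above, the walk yields data/2025/table1.csv and data/2025/table1.parquet.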

Example metadata.json (Manual)

Created by you in the root folder.

{
  "2025": {
    "doi": "10.5555/TEST.DOI.2025"
  },
  "table1.csv": {
    "description": "Aggregated summary of global intangible assets (CSV)."
  },
  "table1.parquet": {
    "description": "Aggregated summary (Columnar format)."
  }
}
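
Since metadata.json is edited by hand, a typo in a key goes unnoticed: the entry just never matches anything in the manifest. The sketch below lists metadata keys that have no file or folder of the same name. Both function names are hypothetical, and matching on the bare name (not the full path) is an assumption based on the examples above.

```javascript
// Hypothetical helper: collects every file and directory name in the manifest.
function collectNames(entries, names = new Set()) {
    for (const entry of entries) {
        if (entry.type === 'file' || entry.type === 'directory') {
            names.add(entry.name);
            collectNames(entry.contents || [], names);
        }
    }
    return names;
}

// Hypothetical helper: returns metadata keys that match nothing in the manifest.
function findUnmatchedMetadataKeys(manifest, metadata) {
    const known = collectNames(manifest);
    return Object.keys(metadata).filter(key => !known.has(key));
}
```

With the two example files above the result is an empty array; a misspelled key such as "tabel1.parquet" would be reported.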