cronjob: updates required for GRASS 8.5.0 release#1692

Open
neteler wants to merge 6 commits into OSGeo:grass8 from neteler:cronjob_updates_after_g850_release
neteler commented May 9, 2026

Member

This PR extends #1689 and updates all cronjobs as follows:

Overall:

  • define the GRASS versions correctly (they must be modified after the GRASS 8.5.0 release)
  • updated comments
  • disabled the "old" (8.4) jobs, as 8.4 is no longer supported

Scripts "preview" (8.6) and "current" (8.5):

  • fetch HTML manual pages from GitHub (created via GitHub "documentation" action)
  • remove internal creation of HTML manual
  • incorporated fetch_unpack_manual_GHA.sh functionality
  • added/synced sphinx support (removed in b90e6d3)
  • removed sitemap creation, as the sitemap is part of the GitHub "documentation" action artifacts
  • "preview": SEO: inject canonical link into versioned manual pages

Script gh_cli_download_artifact.sh:

  • added multi branch support

Removed fetch_unpack_manual_GHA.sh as no longer needed and updated cronjob list accordingly.

On the server grass.osgeo.org, "grass-stable/" and "grass-devel/" are true directories (neither Apache redirects nor symlinks on disk). The scripts above sync them with rsync from the versioned source directories.

neteler requested review from nilason and wenzeslaus May 9, 2026
neteler self-assigned this May 9, 2026
neteler added the CI Continuous integration label May 9, 2026
Comment on lines 64 to 71
 RUN_ID=$(gh run list \
---branch main \
+--branch "$MYBRANCH" \
 --repo "$OWNER/$REPO" \
 --workflow "$WORKFLOW_NAME" \
 --status success \
 --limit 1 \
 --json databaseId \
 --jq '.[0].databaseId')

Member

I think this doesn't fetch the latest run commit-wise, only the most recently executed one. So if we rerun an older failed job, say 3 months later, it'll fetch that artifact, if my understanding is right. And that's not what we want.

Member Author

Agreed. So we need a more sophisticated use of gh here.

Member Author

Idea: name the documentation artifact with YYYYMMDD-<branch>-mkdocs.zip using the commit date (rather than the workflow date). Then download the latest file to the server etc. That would be done in the "documentation" workflow.
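
An alternative that stays within gh could be to pin the lookup to the head commit of the branch instead of the most recently executed run. The sketch below is untested against the real repository and assumes a gh release whose `gh run list` supports the `--commit` filter. The function name `latest_run_for_head` is hypothetical; OWNER, REPO and WORKFLOW_NAME are assumed to be set as in the snippet above.

```shell
#!/bin/sh
# Sketch only: resolve the head commit of a branch, then look up the
# successful workflow run for exactly that commit.
# Assumes OWNER, REPO and WORKFLOW_NAME are already set by the caller.
latest_run_for_head() {
    branch=$1

    # Head commit SHA of the branch (REST: GET /repos/{owner}/{repo}/commits/{ref})
    sha=$(gh api "repos/$OWNER/$REPO/commits/$branch" --jq '.sha') || return 1
    [ -n "$sha" ] || return 1

    # Successful run for that commit; prints nothing if there is none (yet)
    gh run list \
        --repo "$OWNER/$REPO" \
        --workflow "$WORKFLOW_NAME" \
        --commit "$sha" \
        --status success \
        --limit 1 \
        --json databaseId \
        --jq '.[0].databaseId // empty'
}
```

With `RUN_ID=$(latest_run_for_head "$MYBRANCH")`, an empty RUN_ID would mean the head commit has no successful run, in which case the cronjob could simply keep the manual already on the server. A rerun of an old failed job would then be ignored, because its commit is no longer the branch head.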

(cd $TARGETHTMLDIRSTABLE ; rm -rf barscales colortables icons northarrows)
# clone manual pages
cp -rp $TARGETHTMLDIR/* $TARGETHTMLDIRSTABLE/
rm -rf $TARGETHTMLDIRSTABLE

Member

So that means the docs will be temporarily unavailable if they are deleted before being rebuilt? (If the process doesn't finish, it's even worse.)

What about having multiple "staging" folders, not served directly, that are like grass85-20260509205104 or something unique enough and the rebuild/unzipping of artifact is done in there, and when done, a symlink swaps what folder is served, atomically. Keep last 2-3 versions on the server. You could swap the symlink to a previous version if something goes wrong.

For not having these folders served but available to be served as symlink, there might be some friction with the configuration of some web servers, which might try to prevent you from serving something outside what is already publicly served.

Member Author

Generally I agree. And would welcome implementation support.
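
As a starting point for discussion, the staging-plus-symlink idea could look roughly like the sketch below. It is not what the cronjobs currently do. The function name `publish_atomically` and the `releases/` layout are hypothetical, `mv -T` requires GNU coreutils, and the served path (e.g. grass-stable) is assumed to already be a symlink or to not exist, which differs from today's true directories.

```shell
#!/bin/sh
# Sketch only: publish a freshly unpacked manual via an atomic symlink swap.
# Hypothetical layout: $base/releases/<name>-<UTC timestamp>/ holds the copies,
# and $base/<name> is a symlink to the one currently served.
publish_atomically() {
    src=$1          # directory with the unpacked artifact
    base=$2         # directory containing releases/ and the served symlink
    name=$3         # served name, e.g. grass-stable
    keep=${4:-3}    # number of releases to retain

    stamp=$(date -u +%Y%m%d%H%M%S)
    release="$base/releases/$name-$stamp"
    mkdir -p "$base/releases" || return 1
    [ ! -e "$release" ] || return 1      # never copy into an existing release
    cp -a "$src" "$release" || return 1

    # rename(2) replaces the old symlink in one step; mv -T is GNU coreutils
    tmp="$base/.$name.tmp.$$"
    ln -s "releases/$name-$stamp" "$tmp" || return 1
    mv -T "$tmp" "$base/$name" || { rm -f "$tmp"; return 1; }

    # Timestamps sort lexically, so a reverse sort lists the newest first
    ls -1d "$base/releases/$name-"* | sort -r | tail -n +"$((keep + 1))" |
        while IFS= read -r old; do rm -rf -- "$old"; done
}
```

A failed copy leaves the served symlink untouched, and a rollback would amount to pointing the symlink at one of the retained releases in the same way. Whether the web server follows such a symlink depends on its configuration, as noted above.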

# clone manual pages
cp -rp $TARGETHTMLDIR/* $TARGETHTMLDIRSTABLE/
rm -rf $TARGETHTMLDIRSTABLE
rsync -a $TARGETHTMLDIR/ $TARGETHTMLDIRSTABLE/

Member

Is there a flag needed to propagate the deletions (if the line above is changed according to my other observation)?

Member Author

Because of the hard cleanup (the rm -rf line next to it), we deliberately left out the (potentially dangerous) delete flag here.
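
If the hard cleanup were dropped, deletions could be propagated by rsync itself. The following is a minimal sketch, assuming a guard against an empty or missing source is enough to address the main risk of the delete flag; the function name `sync_mirror` is hypothetical and the behaviour has not been tested on the server.

```shell
#!/bin/sh
# Sketch only: mirror SRC into DST, removing files that vanished from SRC,
# without wiping DST first. Refuses to run on a missing or empty source,
# which is the main way --delete could empty the served directory.
sync_mirror() {
    src=$1
    dst=$2

    [ -d "$src" ] || { echo "source missing: $src" >&2; return 1; }
    [ -n "$(ls -A "$src")" ] || { echo "source empty: $src" >&2; return 1; }

    # Trailing slashes: copy the contents of src into dst.
    # --delete-after removes stale files only once the transfer has finished.
    rsync -a --delete-after "$src/" "$dst/"
}
```

Usage would be `sync_mirror "$TARGETHTMLDIR" "$TARGETHTMLDIRSTABLE"`. Unlike rm -rf followed by rsync, the target stays populated during the transfer, although individual files are still replaced one by one rather than atomically.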

Comment thread utils/cronjobs_osgeo_lxd/cron_grass_preview_build_binaries.sh Outdated
Comment thread utils/cronjobs_osgeo_lxd/cron_grass_preview_build_binaries.sh Outdated
Comment on lines +269 to +270
rm -rf $TARGETHTMLDIRDEVEL
rsync -a $TARGETHTMLDIR/ $TARGETHTMLDIRDEVEL/

Member

Similar observations as for stable versions

neteler commented May 18, 2026

Member Author

Note: while test-deploying this PR on the server (there is no "dry-run" environment), I discovered that this sitemap does not actually exist on the server:
https://grass.osgeo.org/grass-devel/manuals/sitemap_manuals.xml
(404)

This means there is a naming confusion. See

default="sitemap_manuals.xml",

vs

https://github.com/OSGeo/grass/blob/5bd94dabee99556bf4956e0fd933617a61676085/.github/workflows/documentation.yml#L250
--output "${MKDOCS_DIR}/site/sitemap.xml" \

Which name should be used? Until this is fixed, the entire "devel" manual no longer shows up in search engines such as Google:

URL is not on Google
This page is not indexed. Pages that aren't indexed can't be served on Google.

echoix commented May 18, 2026

Member

Keep in mind, from what I remember of playing around last year, the generated sitemap is not perfect.

https://github.com/OSGeo/grass/blob/5bd94dabee99556bf4956e0fd933617a61676085/python/grass/docs/conf.py#L265-L267

and

https://github.com/OSGeo/grass/blob/5bd94dabee99556bf4956e0fd933617a61676085/python/grass/docs/conf.py#L499-L502

collide, meaning we can't configure it correctly, as the last assignment wins.

Either the canonical rel links are correct, or the sitemap is correct.

neteler commented May 19, 2026

Member Author

OSGeo/grass@5bd94da/python/grass/docs/conf.py#L499-L502

How about calling it sitemap_sphinx.xml here to avoid a name collision? It doesn't need to be named sitemap.xml.

echoix commented May 19, 2026

Member

> OSGeo/grass@5bd94da/python/grass/docs/conf.py#L499-L502
>
> How about calling it sitemap_sphinx.xml here to avoid a name collision? It doesn't need to be named sitemap.xml.

No, it's the html_baseurl that is assigned twice.

neteler commented May 20, 2026

Member Author

Why is that relevant? Sorry, I don't get it.
I'm discussing the 404 of the sitemap file on the server.

Labels

CI Continuous integration

2 participants