Posts

recent work in math knowledge rep 2021Q3

2021-Q3 survey of recent work in math knowledge rep

Background definitions and motives

FAIR Data: (Wikipedia) data which meet principles of findability, accessibility, interoperability, and reusability. The go-fair site covers these principles in more depth.

Some surveys

- "(Deep) FAIR Mathematics" (KWARC), 2020-02-25, by Katja Berčič, Michael Kohlhase, Florian Rabe
- "Big Math and the One-Brain Barrier: The Tetrapod Model of Mathematical Knowledge" by Jacques Carette, William M. Farmer, Michael Kohlhase, Florian Rabe
- An expanded group from McMaster + KWARC shared working notes and tables as of 2020-Feb in "The Space of Mathematical Software Systems -- A Survey of Paradigmatic Systems"
- "A Survey of Languages for Formalizing Mathematics" (KWARC), 2020, by Cezary Kaliszyk and Florian Rabe

Some proceedings

- Accepted Papers at ITP2021 -- 12th International Conference on Interactive Theorem Proving
- ITP2021 Proceedings as a single PDF
- CADE 2021 Proceedings (open access at Springer)

AxioMagic under Spark on cloud (AWS, Azure, ...)

AxioMagic relies on functional composition over immutable inputs, and thus matches modern evaluation frameworks like Apache Spark, which defines a structured approach to managing data contexts of functional evaluation over data frames. Spark embodies a refinement of both ETL and MapReduce methodology. Engines like Spark require computation and storage resources which may come from any cluster, but today these are commonly provided from cloud environments, yielding high-volume functional computation at predictable, usage-based costs. We especially value Spark for open scientific dataset analysis, where it is common to work with large data files in interchange formats.

Essential features of Spark distributed evaluation

Spark assembles batch pipelines to pump datasets of structured records through functional tasks (which may invoke side-effects, with all necessary stipulations). This pipeline serves to evaluate a given distributed functional algorithm…
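To make the pipeline idea concrete, here is a minimal sketch of functional evaluation over a DataFrame in Spark's native Scala API. The bucket path, column names, and aggregation are hypothetical stand-ins, not part of AxioMagic itself.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object SparkSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("axiomagic-spark-sketch")
      .getOrCreate()

    // Read a large interchange-format file (CSV here) into an immutable DataFrame.
    val observations = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("s3://my-bucket/observations.csv") // hypothetical cloud path

    // Pure functional transformations: each step yields a new immutable DataFrame.
    val summary = observations
      .filter(col("value").isNotNull)
      .groupBy(col("series"))
      .agg(avg("value").as("mean"), stddev("value").as("sd"))

    summary.show()
    spark.stop()
  }
}
```

Each transformation returns a new immutable DataFrame, so the pipeline composes like ordinary function application; Spark schedules the actual cluster work only when an action such as show() forces evaluation.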

AxioMagic Progress Report: 2021-Summer

- 2021-April: AxioMagic status and direction summarized in website updates + tweetstorm. Tweety-txt is now captured (on 2021-08-13), approx 8K ~= 150 lines. Worth hammering into another blog entry here? Think so, yeah. So please stand by for that.
- 2021-May: importing artifacts from Lean MathLib
- 2021-Jul: increasing appreciation for the value of commutators and adjoints!
- 2021-Aug: importing MXNet via DJL.ai

Domain types based on probability monads

[Image: summary table of probability monad concepts, from the nCatLab website]
(Important: I did not create the image shown in the blogger preview - it is from the nCatLab website!)

Composed stochastic proof for low-dimensional applications

Following up on the previous post on Composed Stochastic Proof, we now enumerate what the typed world owes the authored world: compile-time access to categorical knowledge about probability monads (overview page at the nCatLab site) over our dependent type system for measure theory (at the MathLib/Lean site). To knit this knowledge into useful, portable applications, we proceed with four major tasks:

Task 1) Starting from a shared library of conceptual math knowledge, we identify applicable categories that are useful in our application model. The nCatLab website is a great current resource on existing work. The page on monads of probability, measures, and valuations contains this summary table of the most useful, widely known functional probability concepts. These are the core ingredient math concepts needed for robust, general processing…
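As a compile-time taste of what such a probability monad looks like as a type, here is a toy Lean 4 sketch of a finitely-supported distribution monad. It is a stand-in illustration only, not MathLib's measure-theoretic Giry/Kantorovich machinery.

```lean
-- Toy finitely-supported distribution: a list of (outcome, probability mass)
-- pairs. A stand-in sketch, not MathLib's measure-theoretic Giry monad.
structure Dist (α : Type) where
  support : List (α × Float)

def Dist.pure (a : α) : Dist α :=
  ⟨[(a, 1.0)]⟩

-- Kleisli bind: weight each outcome of the continuation by the mass
-- of the outcome that produced it.
def Dist.bind (d : Dist α) (f : α → Dist β) : Dist β :=
  ⟨d.support.bind fun (a, p) => (f a).support.map fun (b, q) => (b, p * q)⟩

instance : Monad Dist where
  pure := Dist.pure
  bind := Dist.bind

-- A fair coin, and a joint experiment written monadically.
def coin : Dist Bool := ⟨[(true, 0.5), (false, 0.5)]⟩

def headCount : Dist Nat := do
  let x ← coin
  let y ← coin
  pure ((if x then 1 else 0) + (if y then 1 else 0))
```

In a dependently typed setting, the same interface can be indexed by the sample space and refined with proofs (for example, that the masses sum to 1), which is the role the MathLib measure-theory types play in the full design.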

Composed stochastic proof: joint distribution monad as dependent type

Composed stochastic proofs - motivation

I posted some background links and concepts on Reddit r/DependentTypes: Dependent Types for Giry probability monads, e.g. Kantorovich / Wasserstein (+ followup comment). In this post, we begin tackling the kinds of distributions needed for our Asset Allocation Example.

Joint distribution monad as dependent type

We define our strongly typed stochastic computing structure in these 3 stages:

1) Mathematically, in informal human-readable equations and type descriptions (formally encoded and exposed as the visible API in 3 below, using infrastructure from 2)
2) As functional abstractions, using the Kantorovich monad formulation (formally encoded as submerged infrastructure, used by 3 below)
3) As a formal dependent-typed API usable from suitably strong-typed environments such as Lean, F*, Idris, Agda, Coq, Isabelle, HOL, MMT, Scala 3 (Dotty)

Restating the WHAT: the software types defined formally in 3 are refinements of the informal types from 1, relying on the functional abstractions from 2.
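For stage 1, the informal mathematical data can be written down directly. The following LaTeX sketch records the standard structure maps of a Giry/Kantorovich-style probability monad and how a joint distribution arises from a marginal and a conditional; the notation here is our own informal choice, not a formal encoding.

```latex
% P X = probability distributions on X (Kantorovich monad: Radon measures of
% finite first moment, with the Wasserstein metric).
\begin{align*}
  \delta &: X \to P X
    && \text{unit: Dirac point mass}\\
  E &: P (P X) \to P X
    && \text{multiplication: average a distribution of distributions}\\
  \mathrm{bind}(p, k) &= E\bigl(P(k)(p)\bigr) : P Y
    && \text{Kleisli extension, for } p : P X,\ k : X \to P Y
\end{align*}
% A joint distribution from a marginal p : P X and a conditional k : X -> P Y:
\begin{align*}
  \mathrm{joint}(p, k) &= \mathrm{bind}\bigl(p,\ x \mapsto P(y \mapsto (x, y))(k\,x)\bigr)
    : P(X \times Y)
\end{align*}
```

Stage 2 encodes these maps as functional abstractions; stage 3 then exposes dependently typed refinements of them as the visible API.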

Application Authoring in RDF

In ~2014, we published an open source authoring technology called glue-ml, as part of glue.ai (Apache 2.0 license). In structural terms, this technique is essentially authoring by graph: authors create declarative descriptive graphs, governed by ontology types.

In the simplest example, an author can save a .ttl (RDF-Turtle) file from a text editor, or from Protege. Other authors may instead use spreadsheets, or a customized domain-specific editor. Semantic and logical validity of authored graphs is up to the author and tools.

Applications read the graphs as RDF using the most convenient technology. The app reading approach is independent of the authoring tool used, since only the information content of the RDF model is important. Application code knows the ontology used by the authored graph, which provides the logical design contract. The app implementation may use a code-binding approach, when that feels comfortable. Alternately the app implementation may treat the authored RDF as a data-source…
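As one example of "the most convenient technology", here is a minimal Scala sketch using Apache Jena to read an authored Turtle file; the file name is a hypothetical placeholder.

```scala
import org.apache.jena.rdf.model.Model
import org.apache.jena.riot.RDFDataMgr

object ReadAuthoredGraph {
  def main(args: Array[String]): Unit = {
    // Load an authored Turtle file into an in-memory RDF model.
    val model: Model = RDFDataMgr.loadModel("authored-app.ttl") // hypothetical file

    // Walk every statement (triple) in the authored graph.
    val it = model.listStatements()
    while (it.hasNext) {
      val stmt = it.nextStatement()
      println(s"${stmt.getSubject}  ${stmt.getPredicate}  ${stmt.getObject}")
    }
  }
}
```

Because only the information content of the RDF model matters, the same graph could equally be loaded by any other RDF toolkit, or queried with SPARQL instead of iterated.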

Asset Allocation example - continued (part 02)

In our last post on Explainable Models: Asset Allocation Example, we separated our asset allocation problem into two parts:

A) Estimating future asset performance and correlation, based on model assumptions and available data
B) Choosing the best portfolio allocation, based on investor preferences and the estimates from A

We define these challenges as entirely separate, facilitated by our careful design of the data flowing out of estimation process A. This clear separation provides practical benefits to teams and systems undertaking either challenge.

We consider A to be primarily a scientific endeavor, albeit one involving some prognosticator's discretion and potential for folly. Generally this estimation will rely in large part on commonly available data, and be comparable with estimates provided by alternative sources. Improvement of these estimates may be pursued in an objective manner.

Meanwhile B involves idiosyncratic decisions by investors, which are often…
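To illustrate the A/B separation, here is a hypothetical Scala sketch of the data contract between the two parts. The record types, field names, and the toy weighting rule are our illustrative assumptions, not the design from the earlier post.

```scala
// A sketch of the data flowing out of estimation process A; shapes are
// illustrative assumptions only.
final case class AssetEstimates(
  assets: Vector[String],            // asset identifiers
  expectedReturns: Vector[Double],   // one expected annual return per asset
  covariance: Vector[Vector[Double]] // pairwise return covariance matrix
)

final case class InvestorPrefs(riskAversion: Double)

// B consumes only the output of A plus investor preferences.
def allocate(est: AssetEstimates, prefs: InvestorPrefs): Map[String, Double] = {
  // Toy rule standing in for a real optimizer: weight each asset by its
  // risk-adjusted score (expected return over its own scaled variance),
  // normalized so the weights sum to 1.
  val scores = est.assets.indices.map { i =>
    est.expectedReturns(i) / (prefs.riskAversion * est.covariance(i)(i))
  }
  val total = scores.sum
  est.assets.zip(scores.map(_ / total)).toMap
}
```

The point is the signature: allocate sees only the estimates record and the investor's preferences, so teams can improve A's estimates or swap B's optimizer independently.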