
Potential Projects

Spencer Smith edited this page Jun 4, 2020 · 38 revisions

This page identifies potential projects to improve and expand Drasil. The scope of these projects is larger than a single issue (Drasil Issue Tracker). Whereas a good issue should generally be closable with less than a week of effort, the projects listed here will likely take longer. Moreover, not all of the project details have been worked out, so the path to closure for each project still needs to be determined. Each project will likely be completed by decomposing it into a series of issues.

All of the projects are larger than a single issue, but beyond that characterization there is considerable variability in their scope. Some are suitable for a summer student research project, while others would be more appropriate for an MEng, Masters or PhD project.

The information given for each project is just a starting point. All of the potential projects require further thought and refinement. An initial version of several of the potential projects is given in the SE4SC Repo. The SE4SC repo (not public) provides some additional brainstormed ideas.

Incorporate Pandoc into DocLang.

Pandoc is a Haskell library for converting from one markup format to another. Instructions on using Pandoc are available at: Pandoc Web-page for Users, while the code itself is maintained in the following repo: Pandoc GitHub Repo. From the Pandoc GitHub page:

> Pandoc has a modular design: it consists of a set of readers, which parse text in a given format and produce a native representation of the document (an abstract syntax tree or AST), and a set of writers, which convert this native representation into a target format.

Pandoc's AST for representing a document should be compared to the Drasil representation of a document. We could use the comparison to improve DocLang, or maybe even replace DocLang with Pandoc. The Pandoc readers and writers may also be helpful for conversion between different document formats.
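One way to start the comparison is to tally which node types a Pandoc document actually uses and set them against the constructs available on the Drasil side. The sketch below is only illustrative: it assumes the JSON form of Pandoc's AST, where each node is an object with a type tag `"t"` and optional contents `"c"`, and it uses a hand-written document. The set `HYPOTHETICAL_DOCLANG_CONSTRUCTS` is invented for the example; the real DocLang constructors would have to be read from the Drasil code.

```python
import json
from collections import Counter

# A tiny document written by hand in the shape of Pandoc's JSON AST.
# The API version number is illustrative only.
PANDOC_JSON = """
{
  "pandoc-api-version": [1, 22],
  "meta": {},
  "blocks": [
    {"t": "Header", "c": [1, ["intro", [], []], [{"t": "Str", "c": "Introduction"}]]},
    {"t": "Para", "c": [
      {"t": "Str", "c": "Drasil"},
      {"t": "Space"},
      {"t": "Emph", "c": [{"t": "Str", "c": "generates"}]},
      {"t": "Space"},
      {"t": "Str", "c": "documents."}
    ]}
  ]
}
"""


def count_node_types(node, counts=None):
    """Recursively tally the "t" tags found anywhere under `node`."""
    if counts is None:
        counts = Counter()
    if isinstance(node, dict):
        if "t" in node:
            counts[node["t"]] += 1
        for value in node.values():
            count_node_types(value, counts)
    elif isinstance(node, list):
        for item in node:
            count_node_types(item, counts)
    return counts


doc = json.loads(PANDOC_JSON)
node_counts = count_node_types(doc["blocks"])

# Hypothetical stand-in for the constructs a Drasil document model offers.
HYPOTHETICAL_DOCLANG_CONSTRUCTS = {"Header", "Para", "Str"}

# Node types that appear in the Pandoc document but have no listed counterpart.
unmatched = sorted(set(node_counts) - HYPOTHETICAL_DOCLANG_CONSTRUCTS)
print(dict(node_counts))
print("Pandoc node types without a counterpart:", unmatched)
```

Running the same tally over real documents exported by Pandoc would give a rough picture of which AST constructs matter most for the comparison.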

Scientific knowledge ontology.

The organization of the code for Drasil has been refactored as patterns have emerged. In particular, when opportunities for reuse are observed, the code is changed to facilitate this. We are implicitly capturing scientific information and the relationship between different theories, definitions, etc. If we could make this information explicit, it could facilitate future additions of knowledge to Drasil. An ontology of scientific knowledge would be useful in its own right, especially one that is backed up by being (at least partially) formalized. An informal ontology for scientific knowledge is currently implicitly available in the organization of books, papers, journals, curricula, etc., but we are not aware of a formal model of scientific knowledge.

Capturing scientific knowledge will require categorizing concepts and determining their properties and the relations between them. As knowledge is gained, errors and inconsistencies in SCS can be avoided at earlier and earlier stages. For instance, as an ontology of physics knowledge takes form, Drasil will know that the concept of the length of a beam makes sense, but the "length" of water does not. Similarly, Drasil will be able to generate warnings, or error messages, when a variable representing length is assigned a negative value, or when Poisson's ratio is outside of the admissible range [0, 0.5], or when a fluid mechanics theory is used outside the laminar flow range (Reynolds number less than 2010 for flow in a pipe).
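The range checks described above can be sketched in a few lines. This is a minimal illustration, not Drasil's constraint representation: the `BoundedQuantity` class and its field names are hypothetical, and treating a length of exactly zero as admissible is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BoundedQuantity:
    """A named quantity with optional inclusive bounds (hypothetical model)."""
    name: str
    lower: Optional[float] = None
    upper: Optional[float] = None

    def violations(self, value: float) -> List[str]:
        """Return one message per bound that `value` breaks."""
        messages = []
        if self.lower is not None and value < self.lower:
            messages.append(f"{self.name} = {value} is below the minimum {self.lower}")
        if self.upper is not None and value > self.upper:
            messages.append(f"{self.name} = {value} is above the maximum {self.upper}")
        return messages


# Bounds taken from the text: length is non-negative, Poisson's ratio is in [0, 0.5].
LENGTH = BoundedQuantity("length", lower=0.0)
POISSON_RATIO = BoundedQuantity("Poisson's ratio", lower=0.0, upper=0.5)

for quantity, value in [(LENGTH, 2.5), (LENGTH, -1.0), (POISSON_RATIO, 0.7)]:
    for message in quantity.violations(value):
        print("warning:", message)
```

An ontology would let such bounds be attached once to the concept (length, Poisson's ratio) and then inherited by every variable that instantiates it.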

Some sample ontologies that might be relevant include:

An overview of ontologies is given on the Wiki Page on Ontology Related Definitions.

Developing our own ontology can be driven by the Drasil examples. As the information is organized for reuse, an ontology should naturally arise. For instance, the SSP example has the local concept of stress, but this really is a concept that applies to any example in continuum mechanics. However, the concept of effective stress should be reserved for continuum mechanics problems where the material is granular, such as a soil. This example is discussed as part of a pull request.

Scientific Project Related Knowledge Ontology

This ontology is based on scientific project knowledge that Drasil already captures. It is not the scientific knowledge related to the domain (discussed in another potential project); it covers things such as people, documents, expressions, references, etc. The plan is to encode the knowledge using the Web Ontology Language (OWL). OWL languages have a formal semantics and are built on the Resource Description Framework (RDF).

The initial steps are:

  1. collect all of the objects that Drasil can talk about directly [there are actually very few!]
  2. collect all the classes of objects that Drasil knows about [there are more]
  3. determine all the properties that Drasil can encode

These are all in the Drasil code (in multiple packages).
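To make the three collection steps concrete, the sketch below writes a handful of objects, classes, and properties as RDF triples in Turtle syntax. Everything under the `ex:` prefix is hypothetical: the namespace, the class names (chosen to echo "people, documents, references" above), the `ex:hasAuthor` property, and the two individuals are placeholders, not names taken from the Drasil code.

```python
# Prefixes used in the output. The "ex" namespace is a made-up placeholder.
PREFIXES = {
    "ex": "http://example.org/drasil#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

# Step 2: classes of objects (hypothetical names).
classes = ["Person", "Document", "Reference"]

# Step 3: properties as (name, domain, range) (hypothetical).
properties = [("hasAuthor", "Document", "Person")]

# Step 1: individual objects as (name, class) (hypothetical).
individuals = [("exampleAuthor", "Person"), ("exampleSRS", "Document")]

triples = []
for cls in classes:
    triples.append((f"ex:{cls}", "rdf:type", "owl:Class"))
for name, domain, range_ in properties:
    triples.append((f"ex:{name}", "rdf:type", "owl:ObjectProperty"))
    triples.append((f"ex:{name}", "rdfs:domain", f"ex:{domain}"))
    triples.append((f"ex:{name}", "rdfs:range", f"ex:{range_}"))
for name, cls in individuals:
    triples.append((f"ex:{name}", "rdf:type", f"ex:{cls}"))
# One fact relating two individuals through the property.
triples.append(("ex:exampleSRS", "ex:hasAuthor", "ex:exampleAuthor"))


def to_turtle(prefixes, triples):
    """Serialize prefixed-name triples as a Turtle document."""
    lines = [f"@prefix {p}: <{iri}> ." for p, iri in sorted(prefixes.items())]
    lines.append("")
    lines.extend(f"{s} {p} {o} ." for s, p, o in triples)
    return "\n".join(lines)


turtle = to_turtle(PREFIXES, triples)
print(turtle)
```

In practice the three lists would be harvested from the Drasil packages rather than written by hand, and the resulting file could be loaded into an ontology editor for comparison with existing ontologies.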

Some useful links include:

We will look for ontologies that are close to what we can already talk about, and then point out the differences between the prior work and Drasil.

Generation of test cases based on data constraints and properties of a correct solution

Test cases can be generated from the typical inputs, data constraints, and the properties of a correct solution. Generating test cases from the data constraints will involve improving the constraint representation, as partially introduced in #1220.
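A simple starting point is boundary-value generation: vary one input at a time to its limits while holding the others at their typical values, and label each case by whether the constraints say it should be accepted. The sketch below assumes a hypothetical `InputSpec` record with a minimum, maximum, and typical value; the input names and numbers are invented, and this is not the constraint representation from #1220.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class InputSpec:
    """Data constraint for one input (hypothetical representation)."""
    name: str
    minimum: float
    maximum: float
    typical: float


def satisfies_constraints(inputs: Dict[str, float], specs: List[InputSpec]) -> bool:
    """True when every input lies inside its inclusive range."""
    return all(s.minimum <= inputs[s.name] <= s.maximum for s in specs)


def generate_cases(specs: List[InputSpec], step: float) -> List[Tuple[Dict[str, float], bool]]:
    """Return (inputs, expected_valid) pairs, varying one input at a time."""
    typical = {s.name: s.typical for s in specs}
    cases = [(dict(typical), True)]  # the all-typical case
    for spec in specs:
        for value in (spec.minimum, spec.maximum):  # on the boundary: valid
            cases.append(({**typical, spec.name: value}, True))
        for value in (spec.minimum - step, spec.maximum + step):  # just outside: invalid
            cases.append(({**typical, spec.name: value}, False))
    return cases


# Invented inputs for illustration only.
specs = [
    InputSpec("length", minimum=0.1, maximum=10.0, typical=1.5),
    InputSpec("width", minimum=0.1, maximum=5.0, typical=1.2),
]

cases = generate_cases(specs, step=0.01)
for inputs, expected_valid in cases:
    print(inputs, "valid" if expected_valid else "invalid")
```

Each valid case could then be run through the generated code and checked against the properties of a correct solution, while each invalid case should trigger the corresponding input error.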

The test case generation facility should also be incorporated into the Travis continuous integration system, so that the generated code for the case studies is automatically tested with each build. Currently, we test that the generated code compiles, but we do not test whether it passes any test cases.

Add 3D Model of the Aorta to Drasil

Cardiovascular diseases are a leading cause of death globally. Lives could potentially be saved if we had a means of early detection of different diseases. Automatic construction of a 3D model of the aorta from CT scans would help researchers and clinicians. Currently 3D aorta reconstruction is done with a mix of manual and automatic tools. Improving the automation would save significant time, which could mean shorter visits to the doctor, and possibly fewer visits.

A tool for automatic reconstruction does not currently exist. An interesting case study of Drasil would be to build this tool, and the associated documentation, using Drasil facilities. The existing Drasil examples were created a priori and then translated into Drasil, whereas this tool would be developed with Drasil from the start.

Generate Jupyter notebooks - probably starting with generating simple physics experiments

Jupyter notebooks are commonly used to present "worksheets" that bring the theory, code and computational results together. This is just a different view of the same information that is already in Drasil. Showing that we can generate Jupyter notebooks would highlight the flexibility of Drasil. It would also highlight the kind of knowledge that we can manipulate.
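A notebook file is a JSON document, so a generator only needs to emit the right structure. The sketch below builds a two-cell notebook (one markdown cell for the theory, one code cell for the computation) using only the standard library. It assumes version 4.4 of the notebook format; the helper names and the free-fall example are invented for illustration and are not part of Drasil.

```python
import json


def markdown_cell(text):
    """A markdown cell holding explanatory text."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(code):
    """A code cell that has not been executed yet."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code,
    }


def notebook(cells):
    """Wrap cells in the top-level notebook structure (format 4.4 assumed)."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


# Invented example: distance fallen from rest under constant gravity.
theory = "## Free fall\n\nDistance fallen from rest: $d = \\frac{1}{2} g t^2$"
code = "g = 9.81  # m/s^2\nt = 2.0   # s\nd = 0.5 * g * t**2\nprint(d)"

nb = notebook([markdown_cell(theory), code_cell(code)])
ipynb_text = json.dumps(nb, indent=1)
print(ipynb_text)
```

Writing `ipynb_text` to a file with the `.ipynb` extension should give something Jupyter can open; in a Drasil setting the theory text and the code would both come from the same captured knowledge.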

The examples for the Jupyter notebook could start with simple physics examples, possibly borrowing ideas from Learn You a Physics for Great Good.

Many potential simple physics problems are given at: My Physics Lab

Automatic Check for Completeness, Correctness and Consistency

If information is missing, Drasil should inform the user. The following information can be checked:

  • Necessary information is provided, or explicitly indicated as not applicable. Necessary information could include properties of a correct solution. The properties of a correct solution are easy to neglect at the early stages, but attention to this detail can pay dividends down the road.
  • The number of inputs is sufficient to find the output of a given equation.
  • Every "chunk" is used at least once. If a theoretical model, for instance, is never referenced elsewhere in the documentation, then it is likely irrelevant for the given problem. As another example, all assumptions should be invoked somewhere, or they shouldn't be in the documentation.
  • It seems likely that every instance model should be invoked by at least one requirement. Automatic generation of the traceability between requirements and instance models will help determine whether this is a realistic check.
  • Add "sanity checkers" to review the Drasil code. These checkers should prevent "silly" mistakes. For instance, if there is a min and a max specified for data constraints, the min should be less than (or equal to?) the max.

It should be possible to turn the checks off, since there could be cases where the user wants to ignore the warnings.
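Two of the checks listed above, unused chunks and inverted min/max ranges, can be sketched together with a switch for disabling individual checks. The `Chunk` record, the chunk kinds, and the check names `"unused"` and `"range"` are hypothetical and chosen for the example; requirements are treated as roots that need not be referenced by anything, which is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple


@dataclass(frozen=True)
class Chunk:
    """A piece of captured knowledge and the chunks it refers to (hypothetical)."""
    uid: str
    kind: str
    refs: Tuple[str, ...] = ()


def unused_chunks(chunks: List[Chunk]) -> List[str]:
    """UIDs of non-requirement chunks that no other chunk references."""
    referenced = {uid for chunk in chunks for uid in chunk.refs}
    return sorted(c.uid for c in chunks
                  if c.kind != "requirement" and c.uid not in referenced)


def inverted_ranges(constraints: Dict[str, Tuple[float, float]]) -> List[str]:
    """Names whose specified minimum is greater than the maximum."""
    return sorted(name for name, (low, high) in constraints.items() if low > high)


def run_checks(chunks, constraints, disabled: FrozenSet[str] = frozenset()) -> List[str]:
    """Run every check that has not been switched off and collect warnings."""
    warnings = []
    if "unused" not in disabled:
        warnings += [f"chunk {uid} is never referenced" for uid in unused_chunks(chunks)]
    if "range" not in disabled:
        warnings += [f"{name}: min exceeds max" for name in inverted_ranges(constraints)]
    return warnings


# Invented data: assumption A2 is never invoked, and "load" has min > max.
chunks = [
    Chunk("R1", "requirement", ("IM1",)),
    Chunk("IM1", "instance-model", ("TM1", "A1")),
    Chunk("TM1", "theory"),
    Chunk("A1", "assumption"),
    Chunk("A2", "assumption"),
]
constraints = {"thickness": (0.0, 10.0), "load": (5.0, 1.0)}

all_warnings = run_checks(chunks, constraints)
quiet_warnings = run_checks(chunks, constraints, disabled=frozenset({"unused"}))
print(all_warnings)
print(quiet_warnings)
```

Keeping each check behind a named switch means a user can silence one category of warning without losing the others.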

Complete and Fix Incomplete Case Studies

The following Drasil case studies do not generate code: Game Physics, SSP and SWHS. These examples should be completed. The SRS for Game Physics also needs to be carefully reviewed. As it is right now, the inputs and outputs for the game physics library are neither complete nor consistent.

To get SSP working will mainly require hooking it into an external library for optimization. The same pattern as for using external libraries with noPCM can be used with SSP, but with optimization libraries.

Add New Case Studies to Drasil

The following projects could be added to Drasil. They are suggested for one or more of the following reasons: they would be of interest to potential students, they are in an area not covered by the current Drasil examples, they are more ambitious than the current Drasil examples:

  • machine learning
  • discrete probability density function
  • family of data fitting algorithms
  • family of finite element analysis programs
  • family of convex hull algorithms

Drasil currently focuses on physics-based examples. Adding general-purpose research software tools would be helpful, since they provide a bridge between the physics problems and how the problems are solved numerically. For instance, the fitting used in GlassBR and in SFS (Software for Solidification, not in Drasil) could be made much more generic. We could have a family of fitting algorithms that could be used in any situation where fitting is required. A proper commonality analysis of this domain could reveal the design decisions that bridge between the requirements and the design. In the SFS example, many different fitting routines were tried. If the experiments could have been done easily via a declarative specification, considerable time would have been saved. If the experiments are combined with automated testing and "properties of a correct solution", the human involvement could be reduced, so that we have partially automated algorithm selection.
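The idea of a declarative specification driving algorithm selection can be illustrated with a deliberately small family of two fitting routines. In this sketch the specification is a plain dictionary naming the candidates to try, in order, and a tolerance on the largest absolute residual; that tolerance stands in for a "property of a correct solution". The routines, the spec keys, and the data are all invented for the example and do not reflect the fitting used in GlassBR or SFS.

```python
def fit_constant(xs, ys):
    """Least-squares constant fit: the mean of the observations."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_linear(xs, ys):
    """Ordinary least-squares line; assumes the xs are not all equal."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# The family of available routines, keyed by name.
FITTERS = {"constant": fit_constant, "linear": fit_linear}


def max_abs_residual(model, xs, ys):
    """Largest absolute difference between the model and the data."""
    return max(abs(model(x) - y) for x, y in zip(xs, ys))


def select_fit(spec, xs, ys):
    """Try candidates in order; return the first (name, model) within tolerance."""
    for name in spec["candidates"]:
        model = FITTERS[name](xs, ys)
        if max_abs_residual(model, xs, ys) <= spec["max_abs_residual"]:
            return name, model
    return None, None


# Declarative specification of the experiment (hypothetical keys).
spec = {"candidates": ["constant", "linear"], "max_abs_residual": 0.05}

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

chosen_name, chosen_model = select_fit(spec, xs, ys)
print(chosen_name, chosen_model(4.0))
```

Adding a routine to the family then means registering one more entry in `FITTERS`, and trying a different experiment means editing only the specification.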
