ConferenceCall 2024 03 20
Ontolog Forum
Latest revision as of 01:51, 21 March 2024
Session | Foundations and Architectures |
---|---|
Duration | 1 hour |
Date/Time | 20 Mar 2024 16:00 GMT |
9:00am PDT/12:00pm EDT | |
4:00pm GMT/5:00pm CET | |
Convener | Ravi Sharma |
Ontology Summit 2024 Foundations and Architectures
Agenda
- Till Mossakowski: ''Neuro-symbolic integration for ontology-based classification of structured objects'' (Slides: https://bit.ly/3TDfNqM)
  - Abstract: Reference ontologies play an essential role in organising knowledge in the life sciences and other domains. They are built and maintained manually. Since this is an expensive process, many reference ontologies cover only a small fraction of their domain. We develop techniques that enable the automatic extension of the coverage of a reference ontology by extending it with entities that have not yet been added manually. The extension should be faithful to the (often implicit) design decisions of the developers of the reference ontology. While this is a generic problem, our use case addresses the Chemical Entities of Biological Interest (ChEBI) ontology with classes of molecules, since the chemical domain is particularly suited to our approach. ChEBI provides annotations that represent the structure of chemical entities (e.g., molecules and functional groups). We show that classical machine learning approaches can outperform ClassyFire, a rule-based system that represents the state of the art for classifying new molecules and is already being used to extend ChEBI. Moreover, we develop RoBERTa and Electra transformer neural networks that achieve even better performance. In addition, the axioms of the ontology can be used during the training of prediction models as a form of semantic loss function. Furthermore, we show that ontology pre-training can improve the performance of transformer networks on the task of predicting the toxicity of chemical molecules. Finally, we show that with ontology pre-training our model learns to focus attention on more meaningful chemical groups when making predictions, paving a path towards greater robustness and interpretability. This strategy has general applicability as a neuro-symbolic approach to embedding meaningful semantics in neural networks.
  - Bio: Till Mossakowski is a professor of theoretical computer science at Otto-von-Guericke University of Magdeburg, Germany. He co-designed the distributed ontology, model and specification language DOL, as well as the corresponding Heterogeneous Tool Set. His research interests are logic, knowledge representation, semantics, and neural-symbolic integration, as well as applications in energy network simulation models, chemistry, and materials science.
  - Video Recording: https://bit.ly/3Tzib0u
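The semantic-loss idea mentioned in the abstract, using ontology axioms as an extra training signal, can be sketched as follows. This is a minimal illustration under assumptions, not the actual implementation: the function name, the probability encoding, and the hinge-style penalty are invented for the sketch; the real work trains RoBERTa/Electra models against ChEBI.

```python
def subclass_violation_loss(probs, subclass_pairs):
    """Hinge-style penalty for subclass axioms (illustrative sketch).

    probs: per-example lists of predicted class probabilities.
    subclass_pairs: (a, b) index pairs encoding the axiom 'class a
    is a subclass of class b'. A prediction consistent with the
    axiom satisfies p[a] <= p[b]; any excess p[a] - p[b] is penalized.
    """
    total = 0.0
    for p in probs:
        for a, b in subclass_pairs:
            total += max(0.0, p[a] - p[b])
    return total / len(probs)

# Toy axiom: class 0 (say, 'alcohol') is a subclass of class 1
# (say, 'organic molecule'). The first prediction violates it.
batch = [[0.9, 0.2],   # p[0] > p[1]: violation, penalty 0.7
         [0.3, 0.8]]   # consistent, penalty 0.0
loss = subclass_violation_loss(batch, [(0, 1)])
```

In practice a term like this would be added, with a weight, to the ordinary classification loss, so that gradient descent discourages predictions that contradict the ontology's axioms.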
Conference Call Information
- Date: Wednesday, 20 March 2024
- Start Time: 9:00am PDT / 12:00pm EDT / 5:00pm CET / 4:00pm GMT / 1600 UTC
- ref: World Clock
- Note: The US and Canada are on Daylight Saving Time while Europe has not yet changed.
- Expected Call Duration: 1 hour
- Video Conference URL: https://bit.ly/48lM0Ik
- Conference ID: 876 3045 3240
- Passcode: 464312
The unabbreviated URL is: https://us02web.zoom.us/j/87630453240?pwd=YVYvZHRpelVqSkM5QlJ4aGJrbmZzQT09
Participants
- Bev Corwin
- Douglas R Miles
- Fabian Neuhaus
- Gary Berg-Cross
- John Sowa
- Ken Baclawski
- Lotti Tu
- Michael Grüninger
- Phil Jackson
- Ravi Sharma
- Riley Moher
- Till Mossakowski
- Todd Schneider
Discussion
[12:11] Ravi Sharma: Are symbols of chemistry so well understood that it can not only identify the chemical but also whether these are allowed based on valence etc?
[12:15] Ravi Sharma: Till the classical is more than training set as it incorporates rules of chemistry as well?
[12:21] Ravi Sharma: where do you use reasoners?
[12:22] Ravi Sharma: so AI enhances deep learning further but does it use transformers?
[12:23] John Sowa: For many important applications, 99% accuracy is unacceptable. In finance, for example, accuracy to a fraction of a cent is an absolute requirement.
[12:25] John Sowa: How can formal deduction be used to ensure absolute consistency with a database and a precise ontology?
[12:25] Ravi Sharma: Bidirectional learning with filtering is a great feature but how much is improvement due to this?
[12:28] Phil Jackson: Can the approach be applied to discovery of patterns of information within DNA molecules?
[12:28] Ravi Sharma: is the pretraining improvement because it learns the chemical rules?
[12:28] Douglas R. Miles: Can a Transformer be used store clause rules ?
[12:29] Douglas R. Miles: store and suggest rules that is*
[12:41] Todd Schneider: John, wouldn’t it be the case that LLMs and neural networks should not be used for some applications?
[12:46] Michael Grüninger: No — the point is that the models of the axioms are isomorphic to graphs (as mathematical structures)
[12:47] Michael Grüninger: It’s not the language that is graphical
[12:48] Todd Schneider: Michael, is your response to John’s comment about ‘linear’ languages’ and graphs?
[12:50] Michael Grüninger: Yes
[12:57] John Sowa: Every graph can be linearized, and every linear notation can be mapped to a graph.
[12:58] John Sowa: But there are an enormous number of different kinds of graphs and linear notations.
[12:58] John Sowa: For different applications and different problems or questions, different representations may be better than others.
[12:59] John Sowa: Fundamental principle: there is no single paradigm that is ideal for all possible applications.
[13:00] John Sowa: Two paradigms are better than one, and multiple paradigms are even better.
[13:00] Michael Grüninger: "A Molecular Structure Ontology for Medicinal Chemistry", Chui and Grüninger, FOIS 2016
[13:00] Riley Moher: Does the attention in the classifier reveal any kind of features or structural properties of graphs? And would you expect this to be correlated with axioms of the ontology?
[13:02] Fabian Neuhaus: @Michael: we also have translated the structural representation in ChEBI into FOL. Unfortunately, many non-leaf nodes in ChEBI do not contain structural definitions, and of those that do, many are incomplete. Thus, the translation to FOL led to wrong axioms / definitions.
[13:04] Lotti Tu: thank you for the great talk!
[13:05] Riley Moher: Thank you, very interesting
[13:06] Bev Corwin: Thank you
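John Sowa's point in the discussion, that every graph can be linearized and every linear notation mapped back to a graph (much as SMILES strings linearize molecular graphs), can be illustrated with a toy round trip. The edge-list notation below is invented for illustration; real chemical linearizations such as SMILES are far richer.

```python
def linearize(graph):
    # graph: dict mapping node -> list of neighbours (undirected).
    # Emit each undirected edge once, in a canonical sorted order.
    edges = sorted({tuple(sorted((u, v)))
                    for u, nbrs in graph.items() for v in nbrs})
    return ";".join(f"{u}-{v}" for u, v in edges)

def parse(text):
    # Invert linearize(): rebuild the adjacency structure.
    graph = {}
    for edge in text.split(";"):
        u, v = edge.split("-")
        graph.setdefault(u, []).append(v)
        graph.setdefault(v, []).append(u)
    return {node: sorted(nbrs) for node, nbrs in graph.items()}

# Ethanol-like fragment: bonds C1-C2 and C1-O1.
g = {"C1": ["C2", "O1"], "C2": ["C1"], "O1": ["C1"]}
text = linearize(g)          # "C1-C2;C1-O1"
assert parse(text) == g      # the round trip is lossless
```

The two directions are exactly the correspondence under discussion: which form is preferable depends on the application, not on the information content.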
Resources

- Slides: https://bit.ly/3TDfNqM
- Video Recording: https://bit.ly/3Tzib0u
Previous Meetings
Session | |
---|---|
ConferenceCall 2024 03 13 | LLMs, Ontologies and KGs |
ConferenceCall 2024 03 06 | LLMs, Ontologies and KGs |
ConferenceCall 2024 02 28 | Foundations and Architectures |
... further results |
Next Meetings
Session | |
---|---|
ConferenceCall 2024 03 27 | Foundations and Architectures |
ConferenceCall 2024 04 03 | Synthesis |
ConferenceCall 2024 04 10 | Synthesis |
... further results |