|
|||
|
|
|||
Our project has two complementary goals. One is to provide a basic logical
infrastructure for a global digital library; we call the basic structure a
Logical Library. The second goal is to create new formal computational
content using the logical library, i.e. to use the library in the
"implementation of computational mathematics."
These two aspects are closely related and synergistic. One cannot build and
study a library without giving it content; specifically one cannot build a
library that supports computational content without knowing the deep issues in
that realm including foundational considerations. Moreover, one cannot build
large amounts of useful content without having a logical library, whose
operations support a large scale collection. One cannot have large social
impact without addressing some of the key general issues in digital libraries,
such as long term archival value, heterogeneous content, global access and
so forth.
We describe specific goals, expected impact, technical challenges and proposed
research plan for each of the two aspects of our work. We start with the
logical library infrastructure in Section 1. The Logical Library, and
treat formal content in Section
2. Computational Content.
1. The Logical LibraryOne way to frame the issues involved in building a logical library is to think about the problem of soundness as we change scale - from local to global. The starting point is a single theorem prover proving a single theorem in the context of a small list of previously proved theorems. Even here there are interesting logical questions, many of them were solved by the tactic mechanism which reduces a complex inference to primitive proof steps; these can be double checked by executing a simple proof checker on the primitive proof (systems that allow very simple checkers are said to satisfy the deBruijn principle). Consider the scale in which multiple provers are contributing to the development of a theory (or the hardening of a software system); each is using a local context of a global library, each is adding new tactics that execute in a local environment. How do we know that the combined work is logically sound? How do we know the results will reprove as the environments are extended? How do we make the replay of proofs stable under extension? A logical library will provide these foundational services as well as routine operations and more advanced operations. Below we describe the operations that we aim to support, progressing from the common ones to the more advanced ones that are needed to support a global community. We then discuss the impact of providing these operations. Then we state the technical obstacles to providing these operations, and finally we discuss our plan of attack on these problems.1.1 Library OperationsOur conceptual path to the library design followed our project's own needs to maintain a flexible development method, permitting divergent partially independent developments, and yet to be able to justify claims of validity, exposing the assumptions of such justifications. Our old system organization encouraged the development of one central sound series of extensions to the semantics and rules, with a few rather isolated variations which could not, without great expense, be incorporated back into the project's main line of development. Often these variations would develop significant generally useful bodies of mathematics which could not be reliably or inexpensively exported and put into the main line. Similarly, sometimes a variant might need to delete incompatible parts of the standard, again practically putting it beyond the useful reach of those using the standard library. The need to modify standard tactic code in the variant developments further inhibited their eventual incorporation into the main line. We needed to develop a library that could contain and account for these various partially incompatible, partially independently developed contents. It became intolerable that one could not embark upon an independent development with the assurance of the support one ought to be able to expect from other project members with a common interest. It became intolerable that one could not significantly alter the refinement engine. In short, the centrifugal research needs and the centripetal necessity of collaboration drove us to a new design. Additionally, we desired to "harden" our tactic proofs so that we could rerun them at will however the library might grow, and to allow for variant refiners and independent proof checkers; and yet eventually one cannot afford to recheck every theorem whenever something has been changed. Once the library is large, most changes must be incremental, and one must be able to target the small part that must be rechecked after a modification. The larger and more complex the body of theorems becomes, the more one must rely on such arguments in order to have any confidence in the correctness of ones work. The possibility of such an argument is one of the great values of formalization in the first place, and we must not give it up simply because we require flexibility and naturalness of expression. Below we list the library operations that we want to make routine and sound by logically correct automation. We already implemented most of the common operations within our LPE and now strive for the more advanced ones.
1.2 Technical ChallengesThe major technical challenges in trying to provide library services arise from features special to formal logical libraries, mainly:
1.3 Toward a Common Logical LibraryAll major automated reasoning projects must provide the basic services listed in Section 1.1. In addition to these we have tried to provide some of the more advanced services as part of the work we have done in carrying out formal software design and verification. As a result we have a large and diverse collection of formal objects on which to experiment with various potential solutions. In order to build library services that can be used by workers in different countries, we must first build them to work among users in different offices in the same group and then among members of a team at remote sites. We have been fortunate to be working with three different provers in the same group, Nuprl 4, Nuprl 5 and MetaPRL. We know how to share results, how to share provers and how to translate between theories. This is the environment for creating concepts and tools that will eventually serve a much wider community. Our aim in this project is to create a common Formal Digital Library with all of the above advanced features and make it a service that can be used by a variety of users and projects who wish to create and share existing formal mathematical knowledge. We will start with providing the services locally in the first two years and extending them in the last three to a global resource -- the external infrastructure to which we must link will be clearer then.1.4 Scientific and Social BenefitsProviding a logical library will result in many significant benefits to scientific practice as well as to the social impact of science. First, we will be able to increase the reliability of reference material at a low marginal cost and provide a starting point for the evolution of these mechanism to dramatically lower cost. We can know that collections of definitions and theorems are correct according to specific designated criteria and are consistent. The correctness can be established at the highest levels of assurance known, namely proofs checked by both humans and machines. The process of progressively providing computer certifications for more and more claims asserted in a collection is a process that we call hardening the collection, and it applies to the software systems stored in the library as well. The library provides an arena for gradual formalization.
Second, we contribute to formal mechanisms for guaranteeing the reliability
of large software systems. An interactive logical library can be used to
develop algorithms and even systems that are
correct-by-construction and documented by the context, as we
will demonstrate. Moreover the logical library provides the means to connect
textual documentation to formal documentation.
Third, a logical library will complement the mechanisms of electronic
publishing and open the way to verify journals that specialize in
formalized mathematics [Miz]. In such
journals every result will be checked by certified theorem provers, including
those for which there is a small proof checker that can be publically
scrutinized (this is a system that obeys the so-called
deBruijn principle).
Fourth, there is significant educational value in formal reference
material. We have used such material in teaching and have studied its impact.
In particular one can learn about a particular system in a context where the
design, the specifications, the algorithms and the proofs are all linked to the
relevant literature. Significant benefits accrue from having static formal
material as is now posted on the Nuprl web site, but even greater advantages
come from allowing users to interact with proofs and algorithms.
Readers can explore the consequences of deleting an assumption or strengthening
a conclusion. They can watch an algorithm execute on concrete data and
symbolically. They can ask whether one result depends on another; they can see
exactly how or whether a proof breaks by changing definitions, lemmas,
inference steps and justifications. They can also decompose a high level
inference step, say built from tactics or derived rules, into its constituent
parts, layer by layer as subjective understanding dictates.
Fifth, the growing database of formal computational mathematics is a new
resource for studies in artificial intelligence. As one example,
members of the AI group at Cornell are generating natural language proofs from
parts of the Nuprl corpus [HMBC99]. Interesting
ideas have been proposed for automating more of the process of formalizing
articles and textbooks.
2. Computational ContentNow we turn to the second category of the work, providing formal computational content; doing this involves both formalizing a standard body of computational mathematics and designing a suitable logical language that can naturally express these formalizations. In this section we will discuss the challenges that arise when pursuing these goals and our plans for addressing them.2.1 Using Type Theory to Formalize Computational MathematicsWe have been working on effective proof development and checking methods for the standard body of computational mathematics for over twenty five years; this body of computational mathematics is a subset of the full body of mathematics (for example, much of the Bourbaki effort does not fall under this category nor does much of the Mizar output). This body of mathematics provides the technical underpinnings of a substantial amount of practical work in computer science, especially for designing and building reliable software systems. Experience in program and system verification has shown that it is necessary to formalize a great deal of this mathematics, and over the years a large body of formal mathematics has been created.
Our main focus has been on formalizing discrete mathematics and algebra, a
further specialization, but we have experience in developing formal real
analysis as well. Our considerable experience in these areas will help us in
addressing the applications issues that arise in logical library design.
We have extensively explored the constructive formalization of algebra, logic,
automata theory, program semantics, and data structures (see FDL content). In
some cases we have done experiments in directly formalizing textbook and
journal articles on these topics [CJNU00].
We believe, as is stipulated in the call for proposals, that constructive
reasoning is practically essential as is the ability to reason about
algorithms. We have spent several years developing a programming logic for ML
and various demonstration programming languages. This experience has taught us
how to recognize features of a logical library that are essential in making it
serve as a library of code for a logical programming environment and as a
version control system for programs and proofs.
We have confirmed the value of constructive reasoning in this domain, and have
argued that natural formalization of many of the basic concepts in
computational mathematics, such as computability, decidability and
reducibility, requires distinctions made only in constructive foundations. The
standard logical languages used for what has become known as classical
mathematics cannot express these ideas in the simple terms that computer
scientists, programmers and software engineers use. Our logical language
designs have always recognized this fact, and we have written a great deal to
explain this key idea [CAB+86].
Moreover, it is the case that constructive foundations are more general than
classical, indeed in many technical senses that matter in practice, classical
mathematics is a subtheory of constructive mathematics.
We have considerable experience in exploring and developing constructive foundations, and we were pioneers in exploiting the practical advantages of constructive reasoning in creating software that is correct by construction. This experience will continue to help us in addressing foundational issues that arise in logical library design.
We have been investigating the issue of how to formalize computational
mathematics for many years, at least since 1971, and we have
undertaken the formalization of significant theories as noted in
Section 1. Even though we have written a great deal about this
problem, the research community is only marginally aware that this is not a
completely solved problem. Most people think that it is known how to formalize
classical mathematics, just formalize set theory and then follow Bourbaki. They
believe that a similar solution must exist for computational mathematics. In
reality this is quite a separate problem, as we continue to point out.
We believe that we are working along the best line known for formalizing
computational mathematics. Our accomplishments indicate that it is a very
promising line. We propose to develop formalizations of all data structures and
algorithms that are considered fundamental, and the bulk of our effort is
funded by applications projects. We are also committed to connecting and
relating our work to that of others as is beneficial in some applications. For
example, the PhD thesis of Jason Hickey provides a formal embedding of
constructive set theory into our foundational type theory. The very deep work
of Doug Howe at Bell Labs showed that the core of Nuprl's type theory is
classically consistent and the law of excluded middle can be added to it [How96b], this allows Nuprl to connect to classical
provers such as HOL, Isabelle and PVS in a coherent way.
Creating the right technical infrastructure will enable a global community of users to build a formal digital library of computational mathematics and associated algorithms. 2.2 Foundational Issues and Technical ChallengesOne of the themes of our work is to automate the richest formal language of computational mathematics that we know how to design, and automate in a way that is extensible since if we are successful the language will continue to expand in expressive power. Our experience is that expressiveness greatly leverages theorem proving power. Until recently it was thought that we paid a high price in theorem prover speed because of the expressive power. But the MetaPRL implementation of Nuprl type theory shows that this is not so; on some examples that implementation is nearly one hundred times faster than Nuprl 4, and it competes with Maude, one of the fastest rewrite engines around.The case for expressiveness is clear, formal concepts must match the natural mode of discourse used by people who want to read the library and interact with it. This means that ordinary first-order logical language is not sufficient, and in our view, even higher-order logic is not sufficiently expressive. These points can be made well in comparing the notion of subtyping and inheritance used in the type systems of programming languages with that used in pure mathematics [CH00]. We see that computational type theory can directly express the programming concepts, but traditional set theory can not do so directly. 2.3 Research Agenda in On-going FormalizationWe are presently focusing on two hard problems in formalizing computational mathematics. One is the issue of reasoning at a larger scale - about classes, theories and systems rather than about types, theorems and programs. This forces us to examine even more closely the problem of reflection and metareasoning which has been of interest to us since the 80's. Ideas for solving these problems have a strong effect on the design of the logical library.The ability to reason about classes and systems will allow us to explore the notion that a software system can be thought of as an implemented constructive theory. We call this idea theories-as-systems, and it is a generalization of the concept of proofs-as-programs that we pioneered. We think that this approach is supported well by the class theory operations that are included in the Nuprl type theory.
An important issue for building content is establishing links between the
logical library and a variety of proofs systems. This is precisely the issue of
relating theories and proving at a large scale. Our library has the
capability to connect to other processes, and we intend to use this for
building connections to PVS and ACL2.
Providing a practical implementation of reflection for open ended
computational theories will not only allow us to manage the library and prove
that large scale operations are correct, but it is also a very good route to
formalizing computational complexity. The fact that computational
complexity facts are not part of the standard libraries reveals a major
gap in the foundational basis. We think we can now close this gap [CC00].
Bibliography
|
|
|
|