Steven Gutstein
email: s.m.gutstein at Google's mail service
phone: (914)-721-0049
Links: Resume (pdf) -- About Me -- Research Interests -- Publications

About Me

I earned a Ph.D. in Computer Science while doing research on transfer learning techniques for deep neural nets. These techniques enable a learner to transfer knowledge previously obtained while learning one set of tasks to help learn another set. This makes it possible to learn new tasks effectively with very small sets of labeled training data.

I also earned a B.A. and an M.A. in Physics. My studies in Physics mainly involved chaos, laser physics and plasma physics. However, as my research began to involve simulating behaviors far removed from the world I inhabit, I began to seek out studies that felt more practical to me.

After studying Physics, I became an actuary, attracted by the intersection of mathematics and real-world phenomena. While working as an actuary, I earned admission to the Society of Actuaries as an associate member. However, I left the profession to pursue studies in Computer Science.

Machine learning has been a far better fit; I enjoy its balance of creativity, mathematical analysis and practicality.

Return to Top

Research Interests

My interest in machine learning stems from a desire to enable machines to interact with dynamic environments in a human-like fashion. This includes the abilities to learn both continuously and with few examples. One way to acquire these capabilities is through transfer learning - the use of previously acquired knowledge to aid the acquisition of new knowledge. The human experience has several examples of transfer learning. For example, familiarity with Spanish facilitates learning Italian. My dissertation research (described below) focused on developing transfer learning techniques.

I also participated in research during three summer internships at Lockheed Martin. I contributed to projects involving type-2 fuzzy logic, differential evolution and, in a flashback to my time studying laser physics, an experimental laser wave mixing project. These projects were motivated by defense applications ranging from automatic detection of suspicious ship movement to designs for sensors capable of detecting extremely low levels of specific chemical compounds.

Dissertation Research

In the course of my dissertation, I developed two new transfer learning techniques for neural nets.
  • Structurally based knowledge transfer, which exploits the layered architecture of a deep net to determine which parts may be productively reused for new problems.
  • Latent learning, which achieves transfer learning without any training on the new tasks. The term 'latent learning' comes from psychology, where it refers to learning that occurs without specific training or reinforcement and instead lies latent until needed. The key insight behind this approach is that when a net learns consistent responses to one set of input classes, it passively acquires consistent responses to related sets of input classes. When this happens, it is easier and, for small training sets, more accurate to learn these latent responses than to train the net to respond to a new set of related classes with target outputs chosen arbitrarily by the net's trainer.
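The structural-transfer idea above can be sketched in a few lines. This is an illustrative toy, not the dissertation's actual nets or training procedure: a hidden layer built for a source task (here simply random, standing in for a pretrained layer) is frozen and reused, and only a small output layer is re-fitted for each task.

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden_features(X, W1, b1):
    """Frozen 'transferred' layer: fixed weights, sigmoid activations."""
    return 1.0 / (1.0 + np.exp(-(X @ W1 + b1)))

def fit_output_layer(H, Y):
    """Fit only the output layer (least squares) on the fixed features H."""
    return np.linalg.pinv(H) @ Y

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], float)

# Stand-in for a hidden layer pretrained on a source task
# (random here for brevity; a real net would have learned W1, b1).
W1 = rng.normal(size=(2, 8))
b1 = rng.normal(size=8)

H = hidden_features(X, W1, b1)

# Two tasks, OR and AND, reuse the same frozen hidden layer; only the small
# output layer is re-fitted, which is cheap and needs few labeled examples.
W_or = fit_output_layer(H, np.array([[0], [1], [1], [1]], float))
W_and = fit_output_layer(H, np.array([[0], [0], [0], [1]], float))

pred_or = (H @ W_or > 0.5).astype(int).ravel()
pred_and = (H @ W_and > 0.5).astype(int).ravel()
```

Because both tasks share the frozen layer, switching tasks only requires solving for a new output layer over the same hidden activations.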

Evaluation of these techniques, using handwritten character recognition as a test-bed problem, indicates that:
  • They significantly increase accuracy when a learner has access to only a small set of labeled data for learning new tasks.
  • They improve the ability to filter out noise when learning new tasks.
  • The accuracy of my latent learning method improves even further when it is followed by traditional supervised learning.

These techniques are likely to have practical applications in fields such as computer vision, object recognition, biometrics, text classification, speech recognition, speech production and robotics. However, further work is needed to develop 'next-gen' methods to:

  • train a source net so that it will be able to more effectively transfer knowledge;

  • determine a net's behavior with respect to target classes other than the ones I've used so far;

  • train the net on the target classes after transfer learning has occurred and achieve a positive transfer back to the source classes;

  • transfer knowledge concerning which features of the raw input image should be ignored during classification. For example, when performing handwritten character recognition, it does not matter whether the letters are white on a black background or black on a white background. Once a net has learned to ignore, or compensate for, these features among a source set of classes, it should be able to transfer that knowledge to help distinguish among a target set of classes.
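As a toy illustration of the last point (my example, not the dissertation's): any feature built from absolute local intensity differences is identical for a character image and its photographic negative, so a net relying on such features would be indifferent to polarity.

```python
import numpy as np

rng = np.random.default_rng(1)
img = rng.random((8, 8))  # stand-in for a handwritten-character image in [0, 1]
neg = 1.0 - img           # same character, opposite polarity

def edge_energy(im):
    """Sum of absolute neighbor differences: a polarity-invariant feature."""
    dx = np.abs(np.diff(im, axis=1))  # horizontal intensity changes
    dy = np.abs(np.diff(im, axis=0))  # vertical intensity changes
    return dx.sum() + dy.sum()

# Inverting the image flips every pixel but leaves the feature unchanged.
print(np.isclose(edge_energy(img), edge_energy(neg)))  # → True
```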

Return to Top

Publications

Journal Papers

  • Steven Gutstein, Olac Fuentes and Eric Freudenthal, Knowledge Transfer in Deep Convolutional Neural Nets, International Journal on Artificial Intelligence Tools (IJAIT) 17(3): 555-567 (2008)

Refereed Conference Publications

  • Steven Gutstein, Olac Fuentes and Eric Freudenthal: Latent Learning in Deep Neural Nets. Proceedings of the International Joint Conference on Neural Networks, Barcelona, Spain, July 2010.

Return to Top

Two Fun Problems From The ITA Problem Bank

ITA Software used to pose problems on its webpage. These problems were generally fun, since they required me to think not only abstractly about algorithm design, but also concretely about how best to implement solutions in code. So, in the interest of sharing some fun puzzles and their solutions, here are my approaches to the 'Lucky Sevens' problem and the Bitvector Genealogy problem:
Lucky Sevens Problem
Bitvector Genealogy Problem

Return to Top