**Welcome to Roy R. Lederman's homepage.**

I am an Assistant Professor at the Department of Statistics and Data Science at Yale University. I am also a member of the Quantitative Biology Institute (QBio) at Yale.

In 2015-2018 I was a postdoc in the Program in Applied and Computational Mathematics at Princeton University, working with Amit Singer. In 2014-2015 I was a Gibbs Assistant Professor in the Applied Mathematics Program at Yale University, where I also got my PhD, working with Vladimir Rokhlin and Raphy Coifman. I have a BSc in physics and a BSc in electrical engineering from Tel-Aviv University.

**I am looking for extraordinary postdocs and extraordinary graduate students (at Yale, any graduate program).**

For more information, please see positions and postdocs.

### Interests and Recent work

- **Mathematics of data science**
- **The combination of inverse problems and unsupervised learning**
- **Applied harmonic analysis**
- **Numerical analysis and signal processing**: the truncated Fourier transform, prolate functions, the Laplace transform, decaying signals
- **Empirical geometry of data**: unsupervised learning, manifold learning, diffusion maps, multi-sensor problems
- **Structural biology and cryo-EM**: inverse problems and unsupervised learning, applications of representation theory, numerical analysis, and data organization to imaging of molecules
- **Computational biology**: fast search algorithms, statistics of DNA, sequencing, organization of biological data

### Cryo-EM

Cryo-electron microscopy (cryo-EM) is a method for imaging molecules without crystallization. The Nobel Prize in Chemistry 2017 was awarded to Jacques Dubochet, Joachim Frank and Richard Henderson "for the development of cryo-electron microscopy, which both simplifies and improves the imaging of biomolecules." I work on various problems of alignment, classification and signal processing that are motivated by cryo-EM and have many other applications. I am particularly interested in heterogeneity, i.e. imaging of mixtures of different types of molecules.

I work on “*hyper-molecules*”, which represent heterogeneous molecules as higher-dimensional objects. The movie below is an example of a reconstruction of a continuously heterogeneous object, using the approach described in **this paper** and **this paper**.

Recent recorded talks:

NYC Computational Cryo-EM Summer Workshop, August 2019

Computational Harmonic Analysis and Data Science, October 2019

**This is one of several approaches that I am developing for the heterogeneity problem in cryo-EM, and for other aspects of cryo-EM.** **For more information on my work in cryo-EM, see project page.**

Preliminary results. See project page.

Acknowledgements: Adam Frost, Lakshmi Miller-Vedam, Joakim Anden

### Numerical Analysis and Signal Processing

### Prolate Functions

The **Truncated Fourier Transform** and its eigenfunctions, **Prolate Spheroidal Wave Functions (PSWF)** and **Generalized Prolate Spheroidal Functions (GPSF)** (also known as Slepian Functions), are frequently encountered in mathematics, physics, signal processing, optics and other areas. Surprisingly, very few resources and very little code for the numerical computation of GPSFs and their eigenvalues are publicly available. Our sample implementation and associated paper are available at http://github.com/lederman/prol. The code also contains an experimental "open-source proof," which is code for analytical proofs of some of the results that appear in this paper.
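As a rough illustration of why these functions matter numerically, the sketch below discretizes the one-dimensional truncated Fourier transform on [-1, 1] with band limit `c` and computes its singular values. This is not the algorithm from the repository above (which targets GPSFs); the Gauss-Legendre discretization, the function name `truncated_fourier_singular_values` and the parameter `num_nodes` are assumptions made for this example.

```python
import numpy as np

def truncated_fourier_singular_values(c, num_nodes=128):
    """Singular values of a discretized (F_c f)(x) = int_{-1}^{1} exp(i c x t) f(t) dt."""
    # Gauss-Legendre nodes and weights on [-1, 1]
    nodes, weights = np.polynomial.legendre.leggauss(num_nodes)
    sqrt_w = np.sqrt(weights)
    # Symmetric weighting makes the matrix a discretization of the
    # operator acting on L^2[-1, 1]
    kernel = np.exp(1j * c * np.outer(nodes, nodes))
    matrix = sqrt_w[:, None] * kernel * sqrt_w[None, :]
    # Singular values are returned in descending order
    return np.linalg.svd(matrix, compute_uv=False)

c = 20.0
singular_values = truncated_fourier_singular_values(c)
# Roughly 2c/pi singular values are close to sqrt(2*pi/c); the rest decay rapidly.
print(np.sum(singular_values > 0.5 * singular_values[0]), 2 * c / np.pi)
```

The singular values of this matrix approximate the magnitudes of the eigenvalues associated with the PSWFs; the sharp drop after about 2c/π of them is the property that many applications rely on.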

### The Laplace Transform and Grunbaum Functions

The **Laplace transform** is frequently encountered in mathematics, physics, engineering and other areas. However, the spectral properties of the Laplace transform tend to complicate its numerical treatment; therefore, the closely related **"Truncated" Laplace Transforms** are often used in applications.
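For orientation, the two operators can be written as follows. The notation is a common convention and an assumption of this note; the papers listed on this page give the precise setting.

$$
(\mathcal{L} f)(\omega) = \int_0^\infty e^{-\omega t} f(t)\,dt,
\qquad
(\mathcal{L}_{a,b} f)(\omega) = \int_a^b e^{-\omega t} f(t)\,dt,
\quad 0 < a < b < \infty .
$$

Restricting the integration to a finite interval bounded away from zero is what distinguishes the truncated transform from the standard one.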

The numerical and analytical properties of the Truncated Laplace Transform are discussed in this paper (dissertation), this paper (part I) and this paper (part II).

### Bounds on Transforms

Lower bounds on the truncated Fourier transform and truncated Laplace transform: see paper.

### Geometry of Data

Alternating Diffusion, a method for recovering the common variable in multi-sensor experiments, is discussed in this paper, this technical report and this project webpage.
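The basic idea can be sketched in a few lines: build a Markov (row-stochastic) diffusion kernel from each sensor separately, multiply the two kernels, and embed the samples using the leading nontrivial eigenvectors of the product. The code below is a toy sketch under stated assumptions, not the implementation from the paper: the synthetic data, the Gaussian kernel with a median bandwidth, and the helper name `markov_kernel` are all choices made for this example.

```python
import numpy as np

def markov_kernel(samples):
    """Row-stochastic Gaussian affinity matrix for one sensor (assumed kernel choice)."""
    sq_dists = np.sum((samples[:, None, :] - samples[None, :, :]) ** 2, axis=-1)
    bandwidth = np.median(sq_dists)  # heuristic bandwidth, an assumption
    affinity = np.exp(-sq_dists / bandwidth)
    return affinity / affinity.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
n = 200
common = rng.uniform(0, 2 * np.pi, n)     # variable observed by both sensors
nuisance1 = rng.uniform(0, 2 * np.pi, n)  # observed only by sensor 1
nuisance2 = rng.uniform(0, 2 * np.pi, n)  # observed only by sensor 2

sensor1 = np.column_stack([np.cos(common), np.sin(common),
                           np.cos(nuisance1), np.sin(nuisance1)])
sensor2 = np.column_stack([np.cos(common), np.sin(common),
                           np.cos(nuisance2), np.sin(nuisance2)])

# Alternate one diffusion step on each sensor's graph
K = markov_kernel(sensor2) @ markov_kernel(sensor1)

# Embed with the leading nontrivial eigenvectors of the product
eigvals, eigvecs = np.linalg.eig(K)
order = np.argsort(-eigvals.real)
embedding = eigvecs[:, order[1:3]].real
```

In this toy setting one would expect the embedding to be organized mainly by `common` rather than by either nuisance variable, since only the common variable affects both kernels.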

A different approach to the common variable recovery problem, which also constructs representations that are invariant to unknown transformations, is discussed in this paper.

**What's going on? Why is everything spinning? See the project webpage, this paper and this report.**

This experiment has nothing to do with the cryo-EM experiment above. Rotating animals are a very convenient visualization.

### Computational Biology

### Random Permutations Based Alignment

I have developed randomized algorithms for sequencing of DNA and RNA.

**Paper**: "A Random-Permutations-Based Approach to Fast Read Alignment" (RECOMB-SEQ 2013).

Also see this paper about the properties of DNA and sequencing.
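A toy illustration of the general idea of permutation-based search: permute the character positions of every reference window with a few random permutations, sort the permuted windows, and look up a read by binary search among its neighbors in each sorted list. A read with a few mismatches still lands next to its true window whenever the mismatches fall late in the permuted order. This is a sketch under assumptions, not the prototype described in the paper; the function names `build_index` and `align` and all parameters are hypothetical.

```python
import bisect
import random

def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def build_index(reference, read_len, num_perms, seed=0):
    """For each random permutation, a sorted table of (permuted window, position)."""
    rng = random.Random(seed)
    windows = [reference[i:i + read_len]
               for i in range(len(reference) - read_len + 1)]
    index = []
    for _ in range(num_perms):
        perm = list(range(read_len))
        rng.shuffle(perm)
        table = sorted(("".join(w[j] for j in perm), pos)
                       for pos, w in enumerate(windows))
        index.append((perm, table))
    return index

def align(read, reference, index, neighbors=4):
    """Return (position, Hamming distance) of the best candidate found."""
    best_pos, best_dist = None, len(read) + 1
    for perm, table in index:
        key = "".join(read[j] for j in perm)
        center = bisect.bisect_left(table, (key, -1))
        lo = max(0, center - neighbors)
        hi = min(len(table), center + neighbors)
        # Only the sorted-order neighbors of the read are examined
        for _, pos in table[lo:hi]:
            dist = hamming(read, reference[pos:pos + len(read)])
            if dist < best_dist:
                best_pos, best_dist = pos, dist
    return best_pos, best_dist

rng = random.Random(1)
reference = "".join(rng.choice("ACGT") for _ in range(2000))
index = build_index(reference, read_len=32, num_perms=16)

true_pos = 700
read = list(reference[true_pos:true_pos + 32])
read[10] = "A" if read[10] != "A" else "C"  # introduce one mismatch
read = "".join(read)
print(align(read, reference, index))
```

Using several permutations trades memory for sensitivity: each additional sorted table gives the read another chance to have its mismatches pushed toward the end of the permuted key.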

**Additional Application: Assembly.**

The algorithm is also used to construct *approximate* overlap graphs. These graphs are used for fast assembly. Unlike other algorithms, this algorithm allows errors in the reads, so *no error-correction is necessary prior to the construction of the graph*. See: paper.

### Additional Computational Biology Algorithms

**Long-Range "Independence"**

The repetitive nature of DNA strings is one of the challenges in read alignment. When one examines longer substrings of DNA, they appear less repetitive, or more unique; permutations-based algorithms benefit from this property. We describe a way of measuring the property in this paper and ways of using this property in reads with many "indels," in this paper.

**Homopolymer Length Filters**

Homopolymer length filters eliminate the mapping problem caused by homopolymer length errors (ionTorrent/454). See paper.
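One simple way to picture such a filter, offered here as an assumption rather than as the definition used in the report: collapse every run of identical bases to a single base, so that two sequences that differ only in homopolymer lengths map to the same string. The helper name `collapse_homopolymers` is hypothetical.

```python
from itertools import groupby

def collapse_homopolymers(sequence):
    """Replace every run of identical bases with a single base."""
    return "".join(base for base, _ in groupby(sequence))

reference_fragment = "ACGGGTTTAC"
read_with_length_error = "ACGGTTTTAC"  # one G dropped, one T added

# Both strings collapse to the same filtered sequence
print(collapse_homopolymers(reference_fragment))
print(collapse_homopolymers(read_with_length_error))
```

Searching on the filtered strings and then verifying candidates against the original sequences is one way such a filter could be combined with an aligner.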

**More information about my work in computational biology is available at** http://roy.lederman.name/compbio/ .

# Papers

Bandeira, Afonso S; Chen, Yutong; Lederman, Roy R; Singer, Amit. Non-Unique Games Over Compact Groups and Orientation Estimation in Cryo-EM. Technical Report, forthcoming.

Brofos, James; Shu, Rui; Lederman, Roy R. A Bias-Variance Decomposition for Bayesian Deep Learning. Conference, 2019.

Shnitzer, Tal; Lederman, Roy R; Liu, Gi-Ren; Talmon, Ronen; Wu, Hau-Tieng. Diffusion operators for multimodal data analysis. Book Chapter, 2019.

Lederman, Roy R; Andén, Joakim; Singer, Amit. Hyper-Molecules: on the Representation and Recovery of Dynamical Structures, with Application to Flexible Macro-Molecular Structures in Cryo-EM. Journal Article, Inverse Problems, 2019. https://arxiv.org/abs/1907.01589

Shaham, Uri; Lederman, Roy R. Learning by Coincidence: Siamese Networks and Common Variable Learning. Journal Article, Pattern Recognition, 2018.

Lederman, Roy R. Numerical Algorithms for the Computation of Generalized Prolate Spheroidal Functions. Miscellaneous, arXiv:1710.02874, 2017. http://github.com/lederman/Prol https://arxiv.org/abs/1710.02874

Boumal, Nicolas; Bendory, Tamir; Lederman, Roy R; Singer, Amit. Heterogeneous multireference alignment: a single pass approach. Miscellaneous, 2017. https://arxiv.org/abs/1710.02590

Lederman, Roy R; Singer, Amit. Continuously heterogeneous hyper-objects in cryo-EM and 3-D movies of many temporal dimensions. Miscellaneous, arXiv preprint arXiv:1704.02899, 2017. https://arxiv.org/abs/1704.02899

Stanton, Kelly P; Jin, Jiaqi; Lederman, Roy R; Weissman, Sherman M; Kluger, Yuval. Ritornello: high fidelity control-free chromatin immunoprecipitation peak calling. Journal Article, Nucleic Acids Research, 2017. https://doi.org/10.1093/nar/gkx799

Lederman, Roy R; Steinerberger, Stefan. Lower Bounds for Truncated Fourier and Laplace Transforms. Journal Article, Integral Equations and Operator Theory, 87 (4), pp. 529-543, 2017. https://link.springer.com/article/10.1007/s00020-017-2364-z https://arxiv.org/abs/1605.03866

Aldroubi, Akram; Huang, Longxiu; Krishtal, Ilya; Lederman, Roy R. Dynamical sampling with random noise. Conference, 2017 International Conference on Sampling Theory and Applications (SampTA), IEEE, 2017. https://doi.org/10.1109/SAMPTA.2017.8024372

Lederman, Roy R; Singer, Amit. A Representation Theory Perspective on Simultaneous Alignment and Classification. Journal Article, Applied and Computational Harmonic Analysis, 2016. https://arxiv.org/abs/1607.03464

Lederman, Roy R; Rokhlin, Vladimir. On the Analytical and Numerical Properties of the Truncated Laplace Transform - Part II. Journal Article, SIAM Journal on Numerical Analysis, 54 (2), pp. 665–687, 2016. https://doi.org/10.1137/15M1028583

Lederman, Roy R; Talmon, Ronen. Learning the geometry of common latent variables using alternating-diffusion. Journal Article, Applied and Computational Harmonic Analysis, 2015. https://doi.org/10.1016/j.acha.2015.09.002 http://cpsc.yale.edu/sites/default/files/files/tr1497.pdf

Lederman, Roy R; Rokhlin, Vladimir. On the Analytical and Numerical Properties of the Truncated Laplace Transform - I. Journal Article, SIAM Journal on Numerical Analysis, 53 (3), pp. 1214-1235, 2015. https://doi.org/10.1137/140990681

Shaham, Uri; Lederman, Roy R. Common Variable Discovery and Invariant Representation Learning using Artificial Neural Networks. Technical Report, YALE/DCS (1506), 2015. http://cpsc.yale.edu/sites/default/files/files/tr1506.pdf

Lederman, Roy R; Talmon, Ronen; Wu, Hau-tieng; Lo, Yu-Lun; Coifman, Ronald R. Alternating diffusion for common manifold learning with application to sleep stage assessment. Conference, 2015 IEEE International Conference on Acoustics Speech and Signal Processing (ICASSP), IEEE, 2015, pp. 5758-5762, ISBN: 978-1-4673-6997-8. http://ieeexplore.ieee.org/abstract/document/7179075/ (doi: 10.1109/ICASSP.2015.7179075)

Lederman, Roy R. On the Analytical and Numerical Properties of the Truncated Laplace Transform. PhD Thesis, Yale University, 2014. http://cpsc.yale.edu/sites/default/files/files/tr1490.pdf

Lederman, Roy R. Using the Long Range “Independence” in DNA: Coupled-Seeds and Pre-Alignment Filters. Technical Report, YALE/DCS (1477), 2013. http://www.cs.yale.edu/publications/techreports/tr1477.pdf

Lederman, Roy R. A random-permutations-based approach to fast read alignment. Journal Article, BMC bioinformatics, 14 (5), pp. S8, 2013 (RECOMB-seq 2013). https://doi.org/10.1186/1471-2105-14-S5-S8

Abstract. Background: Read alignment is a computational bottleneck in some sequencing projects. Most of the existing software packages for read alignment are based on two algorithmic approaches: prefix-trees and hash-tables. We propose a new approach to read alignment using random permutations of strings. Results: We present a prototype implementation and experiments performed with simulated and real reads of human DNA. Our experiments indicate that this permutations-based prototype is several times faster than comparable programs for fast read alignment and that it aligns more reads correctly. Conclusions: This approach may lead to improved speed, sensitivity, and accuracy in read alignment. The algorithm can also be used for specialized alignment applications and it can be extended to other related problems, such as assembly.

Lederman, Roy R. A permutations-based algorithm for fast alignment of long paired-end reads. Technical Report, YALE/DCS (1474), 2013. http://ftp.cs.yale.edu/publications/techreports/tr1474.pdf

Lederman, Roy R. A Note about the Resolution-Length Characteristics of DNA. Miscellaneous, YALE/DCS (1473), 2013. http://www.cs.yale.edu/publications/techreports/tr1473.pdf

Lederman, Roy R. Building approximate overlap graphs for DNA assembly using random-permutations-based search. Technical Report, YALEU/DCS (1470), 2012. http://cpsc.yale.edu/sites/default/files/files/tr1470(1).pdf

Lederman, Roy R. Homopolymer Length Filters. Miscellaneous, YALE/DCS (1465), 2012. http://cs-www2.cs.yale.edu/publications/techreports/tr1465.pdf

# Select Teaching

S&DS262/S&DS562: Computational Tools for Data Science (Yale, Spring 2020)

S&DS663: Computational Mathematics for Data Science (Yale, Fall 2019)

S&DS676: Signal Processing for Data Science (Yale, Spring 2019)

S&DS663: Computational Mathematics for Data Science (Yale, Fall 2018)

MATH555 / AMTH555: Elements of Mathematical Machine Learning (Yale, Spring 2015)

MATH 112: Calculus of Functions of One Variable I (Yale, Spring 2015)