Related references
Note: Only some of the references are listed.
Article
Computer Science, Theory & Methods
Nils Rethmeier et al.
Summary: Modern NLP methods use self-supervised pretraining objectives, such as masked language modeling, to improve downstream task performance. Contrastive self-supervised objectives have been successful in image representation pretraining, but in NLP even a single-token augmentation can change a sentence's meaning, which makes augmentation design more difficult. This primer summarizes recent self-supervised and supervised contrastive NLP pretraining methods and their applications in language modeling, zero- to few-shot learning, pretraining data efficiency, and specific NLP tasks.
ACM COMPUTING SURVEYS
(2023)
Article
Computer Science, Artificial Intelligence
Shaoxiong Ji et al.
Summary: This survey provides a comprehensive review of knowledge graphs, covering knowledge graph representation learning, knowledge acquisition and completion, temporal knowledge graphs, and knowledge-aware applications. The study proposes a categorization and taxonomies for these topics and explores emerging themes such as meta-relational learning, commonsense reasoning, and temporal knowledge graphs. It also offers curated data sets and open-source libraries to facilitate future research in the field.
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
(2022)
Article
Computer Science, Artificial Intelligence
Linting Xue et al.
Summary: Most widely used language models operate on token sequences, but token-free models that work directly on raw text have several advantages. This study shows that a standard Transformer architecture can process byte sequences with minimal modifications, and that byte-level models are more robust to noise and perform better on tasks sensitive to spelling and pronunciation.
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
(2022)
Article
Computer Science, Artificial Intelligence
Jonathan H. Clark et al.
Summary: Pipelined NLP systems have largely been replaced by end-to-end neural modeling, yet most models still require explicit tokenization. Canine, a neural encoder, operates directly on character sequences without tokenization or a fixed vocabulary, combining downsampling with a deep transformer stack for effective and efficient processing. Canine outperforms mBERT on the multilingual benchmark TyDi QA despite having fewer parameters.
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
(2022)
Article
Computer Science, Artificial Intelligence
Xin He et al.
Summary: Deep learning techniques have achieved remarkable results in various tasks, but building a high-quality DL system still demands substantial human expertise. Automated machine learning (AutoML) is a promising solution and is currently an active area of research.
KNOWLEDGE-BASED SYSTEMS
(2021)
Article
Mechanics
Preetum Nakkiran et al.
Summary: A study reveals a 'double-descent' phenomenon in modern deep learning tasks, where test performance first improves, then worsens, and then improves again as model size increases. The concept of effective model complexity is introduced to unify these phenomena, and a generalized double descent is conjectured based on this measure. Certain scenarios are also identified in which increasing the number of training samples, even several-fold, can actually harm test performance.
JOURNAL OF STATISTICAL MECHANICS-THEORY AND EXPERIMENT
(2021)
Article
Engineering, Electrical & Electronic
Bernhard Schoelkopf et al.
Summary: The fields of machine learning and graphical causality have begun to influence each other and stand to benefit from each other's advances. Understanding fundamental concepts of causal inference, and relating them to key open problems in machine learning, can help advance modern machine learning research. A central problem at the intersection of AI and causality is causal representation learning: discovering high-level causal variables from low-level observations.
PROCEEDINGS OF THE IEEE
(2021)
Proceedings Paper
Computer Science, Artificial Intelligence
Arjun Gopalan et al.
Summary: Neural Structured Learning (NSL) is a learning paradigm, implemented as a framework in TensorFlow, that trains neural networks by leveraging structured signals. NSL can represent structure explicitly or implicitly and is widely used across products and services at Google.
WSDM '21: PROCEEDINGS OF THE 14TH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING
(2021)
Article
Computer Science, Artificial Intelligence
William Merrill et al.
Summary: This study investigates whether ungrounded systems can acquire meaning, finding that assertions enable semantic emulation in languages with strong semantic transparency. In languages where the same expression can take different values in different contexts, however, emulation may become uncomputable. The study formally characterizes these limits of ungrounded language models in acquiring semantic representations.
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
(2021)
Article
Computer Science, Theory & Methods
Connor Shorten et al.
Summary: This survey examines the use of Deep Learning in combating the COVID-19 pandemic and offers insights into future research directions. It covers applications in Natural Language Processing, Computer Vision, the Life Sciences, and Epidemiology, showcasing examples of Deep Learning's effectiveness in fighting the pandemic while also highlighting its limitations in COVID-19 applications.
JOURNAL OF BIG DATA
(2021)
Article
Biochemical Research Methods
Xiangxiang Zeng et al.
JOURNAL OF PROTEOME RESEARCH
(2020)
Article
Computer Science, Artificial Intelligence
Mandar Joshi et al.
TRANSACTIONS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS
(2020)
Review
Medical Informatics
Irena Spasic et al.
JMIR MEDICAL INFORMATICS
(2020)
Article
Computer Science, Theory & Methods
Connor Shorten et al.
JOURNAL OF BIG DATA
(2019)
Proceedings Paper
Computer Science, Theory & Methods
Xing Wu et al.
COMPUTATIONAL SCIENCE - ICCS 2019, PT IV
(2019)
Proceedings Paper
Computer Science, Artificial Intelligence
Rui Wang et al.
PROCEEDINGS OF THE 2019 GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE (GECCO'19)
(2019)
Article
Computer Science, Theory & Methods
Justin M. Johnson et al.
JOURNAL OF BIG DATA
(2019)
Article
Multidisciplinary Sciences
Alistair E. W. Johnson et al.
Proceedings Paper
Computer Science, Theory & Methods
Joseph D. Prusa et al.
PROCEEDINGS OF 2016 IEEE 17TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION (IEEE IRI)
(2016)
Proceedings Paper
Computer Science, Artificial Intelligence
Joseph Prusa et al.
2015 IEEE 16TH INTERNATIONAL CONFERENCE ON INFORMATION REUSE AND INTEGRATION
(2015)
Article
Computer Science, Cybernetics
Chris Seiffert et al.
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS
(2010)