Frequently Asked Questions

General Questions

What does neuralstudio.ai do?

neuralstudio.ai makes powerful automated machine learning available to anyone and everyone who has data – without requiring knowledge of the underlying algorithms or the arcane arts of tuning algorithm hyper-parameters. neuralstudio.ai also handles the tedious chore of preparing data for use by machine learning algorithms – and provides comprehensive insights into the utility and impact of specific fields in the data files. Finally, neuralstudio.ai offers multiple paths to place the results of machine learning into service. Algorithms can be executed with new data right on neuralstudio.ai, or algorithms can be run on-premises, either in Microsoft Excel or integrated with existing or new custom applications, using separately licensed NeuralStudio software components. neuralstudio.ai even supports AI for embedded IoT (Internet of Things) systems – algorithm tuning can be performed with ease on neuralstudio.ai, and when ready for deployment, object code can be created for loading into microcontrollers. Contact sales@neuralstudio.ai for further information about IoT options.

How can I use neuralstudio.ai?

It’s easy! Just click the Sign Up button, provide your name, e-mail address, and country of residence, and select a Guest account to get started. Be sure to review our Terms of Use and our Privacy Policy. After your account is activated, you can work with one of the sample files that we will put in your account, then you can upload your own data files. Click the ? icon to get help with the steps in an operation (job). At the end of every job, you’ll receive an e-mail with a detailed report attached that describes the job and results. Any additional output can be downloaded from your Dashboard.

What are the limitations of Guest accounts?

Guest accounts let you use all the capabilities of neuralstudio.ai at no charge for up to 60 days. However, since usage consumes resources, we have to limit the size of files that can be uploaded to Guest accounts, and limit the compute time that can be applied to Guest account jobs.

  • Each individual data file uploaded to a Guest account cannot be larger than 2 MB.
  • A maximum of 20 data files may be uploaded to a Guest account.
  • The total storage a Guest account can use is limited to 2 GB (this includes uploaded files as well as the models, reports, and result files that are created by neuralstudio.ai).
  • The number of records in a file uploaded to a Guest account is limited to 1500 (only 1500 records will be processed, even if the file contains more records).
  • The number of fields in a record is limited to 20 (if there are more than 20 fields the Basic Data Preparation job will fail). In addition, for a file intended to be used in Enhanced Data Preparation, the number of active fields is limited to 15.
  • The maximum number of classes for a Classification model is 6.
  • The maximum number of Epochs permitted for Clustering training is 500.

You can view the limits on your account by visiting the Account tab in the Profile section of your Dashboard.

What are the limitations of other accounts?

The limitations of accounts other than Guest and Student accounts are “sanity checks” to ensure that huge files are not mistakenly uploaded, or that jobs do not run forever. The limits identified here can be increased if requested (contact help@neuralstudio.ai).

  • The default maximum size for a single uploaded file is 75 MB.
  • Individual accounts are allowed up to 10 GB of storage.
  • The maximum number of data records is 1 million.
  • The maximum number of fields in a record is 500.
  • The maximum number of fields for an Enhanced Data Preparation job is 50.
  • The maximum number of classes for a Classification model is 30.
  • The maximum number of Epochs permitted for Clustering training is 5000.

You can view the limits on your account by visiting the Account tab in the Profile section of your Dashboard.

There are also maximum values for Genetic Algorithm parameters; these limits are enforced by the User Interface, but may also be increased upon request.

neuralstudio.ai will terminate processing if the following time constraints are exceeded.

  • Enhanced Data Preparation jobs are allowed 4 hours of compute time per field, plus an additional 6 hours.
  • Single Model jobs are allowed up to 4 hours of compute time, or up to 5 hours when Premium Resources are used.
  • For Model Ensemble Optimization, an average of 3 hours of compute time per model is allowed.
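
The time limits above are simple arithmetic. The following sketch shows one way to work them out; the function names are hypothetical, and reading "an average of 3 hours per model" as a total budget of 3 hours times the number of models is our interpretation rather than a stated rule.

```python
def enhanced_prep_limit_hours(fields: int) -> int:
    """4 hours per field plus a fixed 6 hours, as listed above."""
    return 4 * fields + 6


def single_model_limit_hours(premium: bool = False) -> int:
    """4 hours, or 5 hours with Premium Resources."""
    return 5 if premium else 4


def ensemble_limit_hours(models: int) -> int:
    """Assumed total budget: an average of 3 hours per model."""
    return 3 * models
```

For example, an Enhanced Data Preparation job with 15 fields would be allowed 4 * 15 + 6 = 66 hours of compute time.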

How do I pay for neuralstudio.ai services?

If you have a Guest account, neuralstudio.ai services are free, subject to limitations described above. For other accounts, you pay for neuralstudio.ai services using an internationally recognized American Express, Mastercard, or Visa credit or debit card. All neuralstudio.ai services are priced in US dollars, and your card will be charged in US dollars. Your card statements will show a charge from NeuralStudio SEZC in the Cayman Islands.

Are neuralstudio.ai subscriptions available?

If you would like a neuralstudio.ai subscription for a team, please contact sales@neuralstudio.ai. We offer a range of pricing and capacity options for 6-month, one-year, and multi-year subscriptions.

Operational Questions

Why can’t I Sign In?

When you initially Sign Up with neuralstudio.ai, you receive a confirmation e-mail at the e-mail address you supplied when you signed up. You must click the confirmation link in that e-mail in order to complete your account registration. If you do not recall receiving the e-mail, please be sure to look in the spam folder of your e-mail app, since the .ai domain is not as common a domain as .com. If you cannot find the registration confirmation e-mail, please contact help@neuralstudio.ai.

What is Basic Data Preparation?

Basic Data Preparation is the initial processing that must be applied to every uploaded data file. During Basic Data Preparation, the value of every field in every record is checked, and if any field in a record is in error the record is not included in the *_base.txt file that is created (* represents the name of the original uploaded data file).

A field value is considered “in error” if the value is missing (no value exists in the file), the value contains the Unknown Data Marker (if one has been defined), or the value contains a non-numeric character when the field should contain characters that together can be interpreted as a number. A field is considered a numeric field if most of the field values are numeric across the entire file.
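
The rules above can be illustrated with a short sketch. This is not the neuralstudio.ai implementation: the function names are hypothetical, Python's float() stands in for the real number parser, "most" is assumed to mean more than half of the values, and the Unknown Data Marker is assumed to match the whole field value.

```python
def is_number(text: str) -> bool:
    """True if the text parses as a number (float() is a stand-in parser)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def numeric_fields(records, threshold=0.5):
    """Indices of fields where more than `threshold` of the values are numeric."""
    columns = list(zip(*records))  # assumes every record has the same length
    return {
        i for i, column in enumerate(columns)
        if sum(is_number(v) for v in column) > threshold * len(column)
    }


def clean_records(records, unknown_marker=None):
    """Keep only the records in which no field value is in error."""
    numeric = numeric_fields(records)
    kept = []
    for record in records:
        in_error = any(
            value == ""                                                  # missing
            or (unknown_marker is not None and value == unknown_marker)  # unknown
            or (i in numeric and not is_number(value))                   # not a number
            for i, value in enumerate(record)
        )
        if not in_error:
            kept.append(record)
    return kept
```

Given five records whose first field is mostly numeric, a record with "oops" in that field, a record with an empty value, and a record containing the marker "N/A" would all be left out of the result.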

Basic Data Preparation also identifies certain patterns of values that are not appropriate for machine learning algorithms (such as monotonically increasing sequence numbers) and marks such fields as IGNORE for training (although the values are kept in the *_base.txt file).

The *_base.txt file that results from Basic Data Preparation is a clean file ready for use by any machine learning algorithm.

The corresponding *_preparation_report file contains information about the distributions of values for every field and occurrences of errors. The report can help quickly resolve data problems, as well as suggest which fields might have under-represented values.

What is HD5?

HD5 (more commonly written HDF5, for Hierarchical Data Format version 5) is a file format designed for storing and organizing very large files. An HD5 file consists of two types of objects – datasets, which are multidimensional arrays; and groups, which are containers that can hold datasets as well as other groups.


What is Enhanced Data Preparation?

Enhanced Data Preparation involves additional processing of the *_base.txt file produced by Basic Data Preparation (* represents the name of the original data file that was uploaded). The purpose of Enhanced Data Preparation is to create a model (aka a Field Model) for all active data fields (fields which have not been flagged IGNORE, by you or by Basic Data Preparation analysis, and which are not Target fields). Enhanced Data Preparation also creates a single Clustering model which partitions all available data into similar groups. The resulting set of models can then be used when a primary model, created using the *_base.txt file, is placed into service and presented with data records that contain bad or missing values. Use of Field Models in a production environment requires licensing additional NeuralStudio software components.

What is Model Ensemble Optimization?

Model Ensemble Optimization is a machine learning process in which (1) parameters that govern algorithm learning are selected and tuned based on actions of a Genetic Algorithm (GA) optimizer, and then (2) high-performing combinations of individual models are identified by a second GA optimizer. In most applications, the combined (averaged) results of a small number of individual models outperform the best individual model.
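
To see why averaging a few models can beat the best single model, consider the toy sketch below. It is not the neuralstudio.ai method: it replaces the second GA optimizer with a plain exhaustive search over small combinations, and the function names and mean squared error metric are assumptions made for illustration.

```python
from itertools import combinations


def mse(predicted, target):
    """Mean squared error between two equal-length lists."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def average(prediction_sets):
    """Element-wise mean of several models' predictions."""
    return [sum(values) / len(values) for values in zip(*prediction_sets)]


def best_ensemble(models, target, max_size=3):
    """Try every combination of up to max_size models; return (names, error)."""
    best = None
    for size in range(1, max_size + 1):
        for names in combinations(sorted(models), size):
            error = mse(average([models[n] for n in names]), target)
            if best is None or error < best[1]:
                best = (names, error)
    return best
```

If one model consistently predicts too high and another consistently predicts too low, their average can land closer to the target than either model alone, which is the effect the ensemble search looks for. Exhaustive search is only practical for a handful of models, which is one reason an optimizer such as a GA is used at scale.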

What does it cost to use neuralstudio.ai?

There are two elements of the price for a neuralstudio.ai operation (job). First, there is a fixed base price for each type of job, and a fixed additional surcharge if you choose to use non-standard (i.e., faster) resources. Second, there is a variable price element that depends on the estimated compute time required for the job. The combination of the two elements becomes the price for the job. The price of a job is always displayed for you to accept before a job starts. If you accept the price, you will never be charged more, even if the job takes longer than the estimated time. You can view current prices for operations on the Pricing page.

What happens if an error occurs during a job?

If an error occurs during a job you will be notified in the e-mail that is sent at the end of every job. In addition, neuralstudio.ai support staff are automatically notified. We will review internal logs and reports to attempt to identify the root cause. In most cases you will not be charged when a job terminates due to a processing error. Please see our Terms of Use.

Can a job be cancelled?

In principle a job can always be cancelled by clicking the STOP THIS JOB button in the Status of your Current Job bar of the Dashboard. However, due to the nature of job processing on Azure, as well as the fact that some job processing finishes very quickly but the status bar is not updated immediately, in practice some jobs may actually complete before the cancel request takes effect. All jobs that are successfully cancelled are subject to a Cancellation Fee. Please see our Terms of Use.

Why can’t I run concurrent jobs?

To prevent unchecked use of resources, no jobs can be run concurrently in Guest accounts. Users with other accounts cannot run two Enhanced Data Preparation jobs concurrently, two Model Ensemble Optimization jobs concurrently, or an Enhanced Data Preparation job and a Model Ensemble Optimization job concurrently. These jobs may consume considerable resources so we want to ensure that Users are fully aware of cost implications when the jobs are run. Also, if two jobs are started concurrently, the jobs MUST use different data sources (the underlying raw data file must be different). One or more concurrent jobs will likely fail if they use the same data source.
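
The concurrency rules above can be summarized as a small decision function. This is a sketch only: the job type names, the account type string, and the function name can_start are hypothetical, and the "different data source" requirement is treated here as a hard rule even though the text above describes it as something that will likely cause failures.

```python
# Hypothetical names for the two resource-intensive job types.
HEAVY = {"enhanced_data_preparation", "model_ensemble_optimization"}


def can_start(new_job, running, account_type):
    """Each job is a (job_type, data_source) pair; returns (allowed, reason)."""
    new_type, new_source = new_job
    if not running:
        return True, "no other job is running"
    if account_type == "guest":
        return False, "Guest accounts cannot run jobs concurrently"
    if new_type in HEAVY and any(job_type in HEAVY for job_type, _ in running):
        return False, "only one resource-intensive job may run at a time"
    if any(source == new_source for _, source in running):
        return False, "concurrent jobs must use different data sources"
    return True, "allowed"
```

For example, a Single Model job on one file may start while an Enhanced Data Preparation job runs on a different file, but a Model Ensemble Optimization job may not.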

What are progress e-mails?

The contact e-mail address for an account always receives an e-mail when a job ends. Progress e-mails are offered to permit monitoring the progress of potentially long-running jobs such as Enhanced Data Preparation and Model Ensemble Optimization, without staying signed in. Currently, progress notifications are sent by e-mail; in the future text message notifications will be offered. By default, progress e-mails are sent only to the account contact e-mail address. However, if a different e-mail address is entered on the Payment page, both the account contact e-mail address and the additional e-mail address will receive progress notifications, and the additional e-mail address will also receive the end-of-job e-mail.

Can I use neuralstudio.ai technology on-premises?

NeuralStudio offers multiple paths to deploy neuralstudio.ai technology for on-premises (local desktop or enterprise data center) use. NeuralWorks neural networks created and evaluated on neuralstudio.ai can be loaded by a Microsoft Excel add-in and used as an Excel MACRO to process data in a standard Excel spreadsheet. For more demanding enterprise application environments, NeuralStudio offers C# and Java Run-Time Software Development Kits, which provide a programming interface for embedding neuralstudio.ai technology in enterprise applications. If Enhanced Data Preparation Field Models are available, an additional software component which seamlessly and dynamically replaces missing or bad field values can also be integrated. Use of neuralstudio.ai technology on-premises requires additional licenses from NeuralStudio. Please contact sales@neuralstudio.ai to discuss your requirements.

Technical Questions

What are the technical limitations of neuralstudio.ai?

The following are technical limitations of neuralstudio.ai and cannot be changed.

- A field (name or value) can have a maximum of 64 alphanumeric characters (based on ISO 8859-1 or Windows-1252 character encoding, in which a character is stored in one byte).

- The maximum number of characters in the name for a field is 30. Field names are truncated to 30 characters if necessary.

- If a field value contains more than 64 characters, the entire record containing the value is ignored. Otherwise, if a field value is alphanumeric, only the first 30 characters are considered when fields are compared for equality, even if the value contains more characters.

- Currently the absolute maximum number of fields permitted in a data file is 1000.

- The maximum number of hidden units in a NeuralWorks neural network is 2000.
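
The character limits above can be expressed in a few lines. The helper names below are hypothetical and the code is a sketch of the stated rules, not the neuralstudio.ai implementation; lengths are counted in characters, which equals bytes under the single-byte encodings mentioned above.

```python
MAX_VALUE_CHARS = 64   # a longer value causes the whole record to be ignored
MAX_NAME_CHARS = 30    # field names are truncated to this length
COMPARE_CHARS = 30     # only this many leading characters are compared


def truncate_name(name: str) -> str:
    """Return the field name as it would be kept after truncation."""
    return name[:MAX_NAME_CHARS]


def record_is_usable(record) -> bool:
    """False if any field value in the record exceeds 64 characters."""
    return all(len(value) <= MAX_VALUE_CHARS for value in record)


def values_equal(a: str, b: str) -> bool:
    """Compare two alphanumeric values on their first 30 characters only."""
    return a[:COMPARE_CHARS] == b[:COMPARE_CHARS]
```

One practical consequence: two alphanumeric values that differ only after the 30th character are treated as the same value.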

Why can’t I upload a file?

In most cases, file upload problems are caused by invalid characters in the name of a file (or in the path to the file on your local computer). Most errors produce an error message with specific information, for example that the size of the file exceeds the limits for your account. After a file is successfully uploaded, additional processing prior to actually starting a Basic Data Preparation job may identify other problems with a file (or, you may inadvertently specify the wrong Field Separator, which will cause errors). After Basic Data Preparation ends, it is important to review the Basic Data Preparation report to confirm that the field information identified by neuralstudio.ai is what you expected. If there are unexpected results, you should fully understand their implications for use of the file by a machine learning algorithm, and correct the file if necessary and upload it again before proceeding.


What is (bad, missing, unknown, invalid) data?

In neuralstudio.ai terminology, “bad” data is data that, while it should be strictly numeric, contains characters which cannot be interpreted as numbers by Java. “Missing” data is exactly that – a field value is empty. In practical terms, that means that a record in a data file contains two consecutive Field Separator characters – such as ,, if comma is specified as the Field Separator. “Unknown” data is data that is marked by a string of characters that you defined as the “Unknown Data Marker” – for example, “N/A”. When neuralstudio.ai encounters the Unknown Data Marker, the field is ignored (treated as missing data), to prevent interpreting the field value incorrectly. “Invalid” data is data that is not valid for domain reasons, such as a numeric value that is unrealistic or not possible in light of the system that generated the value. Checking the validity of data values is not currently supported directly on neuralstudio.ai and must be separately performed on data files.


What kind of algorithms are offered by neuralstudio.ai?

neuralstudio.ai currently offers deep-learning neural networks which are trained using a proprietary implementation of cascade correlation and back-propagation. Other machine learning algorithms will be offered soon, and will be incorporated in the modeling process flow of neuralstudio.ai. If there is a particular algorithm you would like to see on neuralstudio.ai, let us know through info@neuralstudio.ai.


Why can’t I optimize Clustering models?

In general, Clustering models have no defined Target and so a performance metric cannot be readily calculated. By their nature, Clustering models typically do not generate sharp boundaries between clusters, and what represents a “good” Clustering model is domain-specific and often somewhat subjective. One way to gain confidence that a Clustering model is “good” is to create several, and confirm that in general records which were clustered together by one model also were clustered together by other models.
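
One simple way to carry out the comparison described above is to count pairs of records. The sketch below is an illustration with a hypothetical function name, not a neuralstudio.ai feature: for every pair of records that one model places in the same cluster, it checks whether a second model also places them together.

```python
from itertools import combinations


def co_clustering_agreement(labels_a, labels_b):
    """Of the record pairs that model A puts in the same cluster,
    return the fraction that model B also puts in the same cluster."""
    together_a = 0
    together_both = 0
    for i, j in combinations(range(len(labels_a)), 2):
        if labels_a[i] == labels_a[j]:
            together_a += 1
            if labels_b[i] == labels_b[j]:
                together_both += 1
    # If model A never groups two records together, there is nothing to contradict.
    return together_both / together_a if together_a else 1.0
```

Each argument is a list holding one cluster label per record, in the same record order. The label values themselves do not need to match between models, since only the groupings are compared. A result near 1.0 suggests the two models agree; note that the pair count grows with the square of the number of records, so a sample may be needed for large files.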


Is it possible to process data files which have fields that are not numeric?

During the neuralstudio.ai Basic Data Preparation operation, all values of all fields in the raw data file are analyzed to determine the fundamental type of each field. The type will be NUMERIC, STRING (alphanumeric), or the field will be marked to IGNORE. A NUMERIC field is a field for which the overwhelming majority of values found in the file are numeric values (and a value which is not numeric causes the corresponding record which contains the value to be ignored). All non-NUMERIC fields are initially considered STRING fields. The Basic Data Preparation process then further analyzes field values, and eliminates (marks as IGNORE) fields whose values would not be useful for a machine learning algorithm. This means fields containing sequential or almost sequential numbers are marked IGNORE, and fields which contain too many (more than 30 by default) unique character strings (for example, a field that contains family names) are also marked IGNORE. In other words, you do not have to explicitly identify whether a field is NUMERIC or STRING – neuralstudio.ai will automatically determine the field type. However, you also can explicitly designate a field IGNORE if you do not want it considered by a machine learning algorithm. NUMERIC field values are used as-is. STRING field values are transformed by a 1-of-N encoding for algorithm use.
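
The field typing and 1-of-N encoding described above can be sketched as follows. This is a simplified illustration, not the neuralstudio.ai implementation: the function names are hypothetical, "overwhelming majority" is assumed to mean at least 90% of values, and only strictly sequential numbers (each value exactly 1 greater than the previous one) are detected, so "almost sequential" fields are not handled.

```python
def is_number(text):
    """True if the text parses as a number (float() is a stand-in parser)."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def field_type(values, max_unique=30, numeric_share=0.9):
    """Classify one field's values as 'NUMERIC', 'STRING', or 'IGNORE'."""
    numeric = [float(v) for v in values if is_number(v)]
    if len(numeric) >= numeric_share * len(values):
        steps = {b - a for a, b in zip(numeric, numeric[1:])}
        # Strictly sequential numbers (1, 2, 3, ...) look like record counters.
        if len(numeric) > 2 and steps == {1.0}:
            return "IGNORE"
        return "NUMERIC"
    if len(set(values)) > max_unique:
        return "IGNORE"  # too many distinct strings, e.g. family names
    return "STRING"


def one_of_n(values):
    """Encode strings as 0/1 vectors with one position per distinct value."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]
```

With the values red, blue, red, the sorted categories are blue and red, so the encoding is [0, 1], [1, 0], [0, 1]: each record gets a 1 in the position of its own category and 0 elsewhere.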