
Diabetes Prediction Test

Predicting Diabetes Risk with a Linear Regression Model

· DiabetesRiskPrediction,EarlyDetectionSavesLives,PreventDiabeteswithML



Diabetes is a chronic metabolic disorder that affects millions of people worldwide. Early detection and lifestyle changes can significantly reduce the risk of developing diabetes and its complications. In this project, I developed a simple linear regression model using a dataset from Kaggle to predict the likelihood of an individual having diabetes based on a handful of basic health measurements.


The dataset contains information on various health parameters of individuals, along with their diabetes status. I trained a linear regression model on this data, and it achieved an accuracy of over 75% in predicting diabetes status.
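As a rough illustration of how a regression output turns into an accuracy figure, the sketch below fits ordinary least squares on a single feature and counts a record as diabetic when the predicted score reaches 0.5. The toy glucose values, the single-feature simplification, the 0.5 cutoff, and the function names are all assumptions made for illustration. They are not taken from the project or from the Kaggle dataset.

```python
def fit_simple_ols(xs, ys):
    """Least-squares slope and intercept for a single feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def accuracy(scores, labels, threshold=0.5):
    """Share of records whose thresholded score matches the 0/1 label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up toy data (NOT from the Kaggle dataset): glucose reading, 1 = diabetic.
glucose = [85, 90, 100, 110, 125, 140, 155, 170]
outcome = [0, 0, 0, 1, 0, 1, 1, 1]

slope, intercept = fit_simple_ols(glucose, outcome)
scores = [slope * g + intercept for g in glucose]
print(f"training accuracy: {accuracy(scores, outcome):.2f}")
```

In practice accuracy would be measured on held-out records rather than on the training data, and a classifier such as logistic regression is the more common choice for a yes/no outcome.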

To predict your own diabetes risk, you would need the following eight measurements, some of which require clinical tests:

1. Number of times pregnant: This applies only to females and represents the number of times the individual has been pregnant.

2. Plasma glucose concentration: This is a measure of the glucose (sugar) level in your blood plasma, obtained through a 2-hour oral glucose tolerance test. The test involves drinking a glucose solution and measuring blood sugar levels before and 2 hours after consumption.

3. Diastolic blood pressure (mm Hg): This is the pressure in your blood vessels between heartbeats, measured in millimeters of mercury (mm Hg). It can be obtained using a blood pressure monitor.

4. Triceps skin fold thickness (mm): This measures the thickness of the skin fold on the back of your upper arm (triceps) and is used to estimate body fat percentage. It is measured using calipers.

5. 2-Hour serum insulin (μU/mL): This is a measure of the level of insulin in your blood 2 hours after consuming a glucose solution during the oral glucose tolerance test.

6. Body mass index (BMI): BMI is a measure of body fat based on height and weight. It is calculated by dividing your weight in kilograms by the square of your height in meters (kg/m^2).

7. Diabetes pedigree function: This is a function that estimates the genetic risk of diabetes based on family history. It takes into account the diabetes status of close relatives.

8. Age (years): Your current age in years.
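The eight inputs above can be collected into a single record before being passed to a model. The sketch below is one possible layout: the class name, the field names, and the example values are hypothetical and are not the column names of the dataset. It also shows the BMI formula from item 6 as a small helper.

```python
from dataclasses import dataclass, astuple


def body_mass_index(weight_kg, height_m):
    """BMI = weight in kilograms divided by height in meters squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


@dataclass
class DiabetesInputs:
    """The eight inputs, in the order listed above (hypothetical field names)."""
    pregnancies: int
    glucose: float
    diastolic_bp: float
    skin_thickness: float
    insulin: float
    bmi: float
    pedigree: float
    age: int

    def as_features(self):
        """Return the values as a flat list of floats, in field order."""
        return [float(v) for v in astuple(self)]


# Illustrative values only, not a real patient record.
person = DiabetesInputs(
    pregnancies=0, glucose=120.0, diastolic_bp=70.0, skin_thickness=20.0,
    insulin=80.0, bmi=body_mass_index(70.0, 1.75), pedigree=0.5, age=35,
)
print(person.as_features())
```

Keeping the field order fixed matters because a linear model pairs each coefficient with a feature by position.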

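Once you have these results, they can be entered into the trained linear regression model to estimate your likelihood of having diabetes. The model outputs a risk score in which higher values indicate a higher risk of diabetes. Because a linear regression output is a weighted sum of the inputs, it is not strictly bounded, so values that fall below 0 or above 1 are best clipped to that range before being read as a probability.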
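A minimal sketch of that prediction step follows. A linear regression score is an intercept plus a weighted sum of the inputs, and since that sum is not guaranteed to stay between 0 and 1, the sketch clips it to that range. The function name `predict_risk` is hypothetical, and the coefficients are placeholders chosen for illustration, not the values learned by the trained model.

```python
def predict_risk(features, weights, intercept):
    """Linear-regression score clipped to the 0-1 range."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    raw = intercept + sum(w * x for w, x in zip(weights, features))
    return min(1.0, max(0.0, raw))


# Placeholder coefficients for illustration only; NOT the trained model's values.
weights = [0.02, 0.006, -0.002, 0.0, 0.0, 0.013, 0.15, 0.003]
intercept = -0.85

# Order: pregnancies, glucose, blood pressure, skin fold, insulin, BMI, pedigree, age.
features = [0, 120.0, 70.0, 20.0, 80.0, 22.9, 0.5, 35]
score = predict_risk(features, weights, intercept)
print(f"risk score: {score:.2f}")
```

With a real model, the weights and intercept would come from the fitted regression, and the feature order would have to match the order used during training.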

Importance of Early Detection:

Knowing that you are likely at high risk of developing diabetes is crucial for making timely lifestyle changes that could save your life. Diabetes, if left untreated or poorly managed, can lead to serious complications such as heart disease, stroke, kidney failure, blindness, and lower limb amputations. By being aware of your risk early on, you can take proactive steps to improve your health and prevent or delay the onset of diabetes.

Measures to Improve Health:

If your test results indicate a high risk of diabetes, there are several measures you can take to improve your health:

1. Maintain a healthy weight: If you are overweight, losing 5-7% of your body weight can significantly reduce your diabetes risk. Focus on a balanced diet and regular exercise.

2. Exercise regularly: Aim for at least 150 minutes of moderate-intensity exercise or 75 minutes of vigorous-intensity exercise per week. This can help improve insulin sensitivity and blood sugar control.

3. Eat a balanced diet: Choose foods that are rich in fiber, whole grains, lean proteins, and healthy fats. Limit your intake of processed foods, sugary drinks, and saturated fats.

4. Manage stress: Chronic stress can contribute to insulin resistance and high blood sugar levels. Practice stress-management techniques such as meditation, deep breathing, or yoga.

5. Get enough sleep: Lack of sleep can disrupt hormone levels and contribute to insulin resistance. Aim for 7-9 hours of sleep per night.

6. Monitor your blood sugar: If you are at high risk of diabetes, your healthcare provider may recommend regular blood sugar monitoring to catch any changes early on.


While this linear regression model provides a simple way to estimate diabetes risk, it is not a substitute for professional medical advice. The model produces a risk score, not a definitive diagnosis. If you have concerns about your diabetes risk, consult a healthcare provider who can give personalized recommendations based on your individual health profile. Nonetheless, this project demonstrates the potential of machine learning to help individuals assess their health risks and make informed lifestyle choices. By being proactive about your health and making positive changes, you can significantly reduce your risk of developing diabetes and its complications.
