Modified and created a new python class for generating a report of metrics for machine learning

I initially posted a question on Stack Overflow and have since come up with an answer to it. Given the models and their parameter grids as dicts, the user can create a report object and get the report in 5 steps.
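As a rough sketch of how two separate dicts (one of models, one of parameters) could be combined into the grid format used in this post: `build_param_grid`, the short model keys, and the `step` argument are hypothetical names introduced only for illustration, and plain strings stand in for real estimator instances so the snippet has no dependencies.

```python
def build_param_grid(models, params, step="cst"):
    """Merge a dict of models and a dict of per-model parameter values
    into the list-of-dicts layout that GridSearchCV accepts."""
    grid = []
    for name, model in models.items():
        # One grid entry per model; the model itself is the only candidate.
        entry = {f"{step}__model": [model]}
        for param, values in params.get(name, {}).items():
            entry[f"{step}__model__{param}"] = list(values)
        grid.append(entry)
    return grid


# Strings stand in for estimator instances such as RandomForestClassifier().
models = {"rf": "RandomForestClassifier()", "svc": "SVC()"}
params = {
    "rf": {"n_estimators": [10, 20], "max_depth": [5, 10]},
    "svc": {"C": [10, 20], "kernel": ["linear"]},
}

param_grid = build_param_grid(models, params)
for entry in param_grid:
    print(entry)
```

Each entry keeps the `<step>__model__<parameter>` naming that a pipeline step expects, so the result could be passed to a grid search in place of a hand-written list.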



The full code is below.



import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score, roc_auc_score, recall_score, precision_score
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.base import BaseEstimator
import warnings
warnings.filterwarnings('ignore')

# Load the breast cancer data and hold out 40% of it as a stratified test set.
cancer = datasets.load_breast_cancer()
df = pd.DataFrame(cancer.data, columns=cancer.feature_names)
df['target'] = cancer.target
target = df['target']
X_train, X_test, y_train, y_test = train_test_split(
    df.drop(columns='target'), target,
    test_size=0.4, random_state=13, stratify=target)


class ClfSwitcher(BaseEstimator):

    def __init__(self, model=RandomForestClassifier()):
        """
        A custom BaseEstimator that can switch between classifiers.
        :param model: sklearn object - The classifier
        """
        self.model = model

    def fit(self, X, y=None, **kwargs):
        self.model.fit(X, y)
        return self

    def predict(self, X, y=None):
        return self.model.predict(X)

    def predict_proba(self, X):
        return self.model.predict_proba(X)

    def score(self, X, y):
        # Delegate to the wrapped classifier.
        return self.model.score(X, y)


class report(ClfSwitcher):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.grid = None
        self.full_report = None
        self.concise_report = None
        # Metrics computed on the held-out test set for the concise report.
        self.scoring_metrics = {
            'precision': precision_score,
            'recall': recall_score,
            'f1': f1_score,
            'roc_auc': roc_auc_score
        }

    def griddy(self, pipeLine, parameters, scoring='accuracy', **kwargs):
        # Pass the requested scoring metric through to the grid search.
        self.grid = GridSearchCV(pipeLine, parameters, scoring=scoring, n_jobs=-1)

    def fit_grid(self, X_train, y_train=None, **kwargs):
        self.grid.fit(X_train, y_train)

    def make_grid_report(self):
        self.full_report = pd.DataFrame(self.grid.cv_results_)

    @staticmethod
    def get_names(col):
        return col.__class__.__name__

    @staticmethod
    def calc_score(col, metric):
        # Relies on the module-level X_train, y_train, X_test and y_test.
        # Every metric, including roc_auc, is computed from hard predictions.
        return round(metric(y_test, col.fit(X_train, y_train).predict(X_test)), 4)

    def make_concise_report(self):
        self.concise_report = pd.DataFrame(self.grid.cv_results_)
        self.concise_report['model_names'] = self.concise_report['param_cst__model'].apply(self.get_names)
        # rank_test_score is 1 for the best candidate, so sort it ascending
        # and keep the first row for each model.
        self.concise_report = (
            self.concise_report
            .sort_values(['model_names', 'rank_test_score'], ascending=[True, True])
            .groupby(['model_names']).head(1)[['param_cst__model', 'model_names']]
            .reset_index(drop=True)
        )

        for metric_name, metric_func in self.scoring_metrics.items():
            self.concise_report[metric_name] = self.concise_report['param_cst__model'].apply(
                self.calc_score, metric=metric_func)

        self.concise_report = self.concise_report[
            ['model_names', 'precision', 'recall', 'f1', 'roc_auc', 'param_cst__model']]


pipeline = Pipeline([
    ('cst', ClfSwitcher()),
])

# One dict per model: the model itself plus the values to search over.
parameters = [
    {
        'cst__model': [RandomForestClassifier()],
        'cst__model__n_estimators': [10, 20],
        'cst__model__max_depth': [5, 10],
        'cst__model__criterion': ['gini', 'entropy']
    },
    {
        'cst__model': [SVC()],
        'cst__model__C': [10, 20],
        'cst__model__kernel': ['linear'],
        'cst__model__gamma': [0.0001, 0.001]
    },
    {
        'cst__model': [LogisticRegression()],
        'cst__model__C': [13, 17],
        'cst__model__penalty': ['l1', 'l2'],
        # liblinear supports both the l1 and l2 penalties.
        'cst__model__solver': ['liblinear']
    },
    {
        'cst__model': [GradientBoostingClassifier()],
        'cst__model__n_estimators': [10, 50],
        'cst__model__max_depth': [3, 5],
        'cst__model__min_samples_leaf': [1, 2]
    }
]

my_report = report()
my_report.griddy(pipeline, parameters, scoring='f1')
my_report.fit_grid(X_train, y_train)
my_report.make_concise_report()
my_report.concise_report
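Because `calc_score` reads the train/test split from module-level variables, one possible refinement is to pass the held-out data in explicitly and to compute ROC AUC from predicted probabilities instead of hard labels. The sketch below is only an illustration under those assumptions: `evaluate` is a hypothetical helper, and the scaled `LogisticRegression` pipeline is a stand-in for whatever fitted classifier is being reported (for example the `best_estimator_` of a fitted `GridSearchCV`).

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def evaluate(fitted_model, X_test, y_test):
    """Return held-out metrics for an already fitted binary classifier."""
    y_pred = fitted_model.predict(X_test)
    # ROC AUC uses the positive-class probability rather than hard labels.
    y_score = fitted_model.predict_proba(X_test)[:, 1]
    return {
        "precision": round(precision_score(y_test, y_pred), 4),
        "recall": round(recall_score(y_test, y_pred), 4),
        "f1": round(f1_score(y_test, y_pred), 4),
        "roc_auc": round(roc_auc_score(y_test, y_score), 4),
    }


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=13, stratify=y
)

# Stand-in model; any classifier exposing predict and predict_proba would do.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

scores = evaluate(model, X_test, y_test)
print(scores)
```

Keeping the data as arguments means the same helper can be reused on another split without touching global state.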








      python machine-learning





      asked 6 mins ago

      scientific_explorer



















